Abstract. A function on a discrete group is weakly combable if its discrete
derivative with respect to a combing can be calculated by a finite state
automaton. A weakly combable function is bicombable if it is Lipschitz in both
the left and right invariant word metrics.

Examples of bicombable functions on word-hyperbolic groups include 

(1) homomorphisms to Z

(2) word length with respect to a finite generating set 

(3) most known explicit constructions of quasimorphisms (e.g. the Epstein- 
Fujiwara counting quasimorphisms) 

We show that bicombable functions on word-hyperbolic groups satisfy a
central limit theorem: If $\phi_n$ is the value of $\phi$ on a random element of word
length $n$ (in a certain sense), there are $E$ and $\sigma$ for which there is convergence
in the sense of distribution $n^{-1/2}(\phi_n - nE) \to N(0,\sigma)$, where $N(0,\sigma)$ denotes
the normal distribution with standard deviation $\sigma$. As a corollary, we show
that if $S_1$ and $S_2$ are any two finite generating sets for $G$, there is an algebraic
number $\lambda_{1,2}$ depending on $S_1$ and $S_2$ such that almost every word of length
$n$ in the $S_1$ metric has word length $n \cdot \lambda_{1,2}$ in the $S_2$ metric, with error of size
$O(\sqrt{n})$.



1. Introduction 

This paper concerns the statistical distribution of values of certain functions on 
hyperbolic groups, and lies at the intersection of computer science, ergodic theory, 
and geometry. Since Dehn, or more recently Gromov (e.g. [12]), it has been standard
practice to study finitely generated groups geometrically, via large-scale properties 
of their Cayley graphs. Geometric spaces are probed effectively by Lipschitz func- 
tions; in a finitely generated group one has the option of probing the geometry of 
the Cayley graph by functions which are Lipschitz with respect to both the left- 
and right-invariant metrics simultaneously. One natural source of such functions
is homomorphisms to R. For instance, Rivin [19] and Sharp [22] have studied
the abelianization map to $H_1$ on a free group, and derived strong results (also,
see Picaud [16]). Unfortunately, many groups (even hyperbolic groups) admit no
nontrivial homomorphisms to R or Z; however, in many cases such groups admit
a rich (uncountable dimensional!) family of quasimorphisms — informally, homo- 
morphisms up to bounded error. Quasimorphisms arise in the theory of bounded 
cohomology, and in the context of a number of disparate extremal problems in 
topology; see e.g. [1] for an introduction. The defining property of a quasimor- 
phism implies that it is Lipschitz in both the left- and right-invariant metrics on a 
group, and therefore one expects to learn a lot about the geometry of a group by 
studying its quasimorphisms. 

A class of groups for which this study is particularly fruitful is that of
word-hyperbolic groups. As Cannon [5] and Gromov [12] pointed out, and Coornaert
and Coornaert-Papadopoulos elaborated ([6], [7], [5]), a hyperbolic group can be
parameterized (i.e. combed) by walks on a finite directed graph, and the (symbolic) 
dynamics of the (combinatorial) geodesic flow can be recast in terms of the theory 
of subshifts of finite type. "Most" groups (in a suitable sense) are word-hyperbolic, 
so this gives rise to a rich family of natural examples of dynamical systems with 
hyperbolic dynamics (in the sense of ergodic theory). Underlying this machinery is 
the fact that geodesics in a hyperbolic group can be recognized by a particularly
simple class of computer program, namely the kind which can be implemented with 
a finite state automaton. 

Putting these two things together, one is naturally led to study weakly combable 
functions — informally, those whose (discrete) derivative can be calculated by a 
finite state automaton — and to specialize the study to the class of bicombable 
functions, namely those which are Lipschitz in both the left- and right-invariant 
metrics. 

Many naturally occurring classes of functions on word hyperbolic groups are 
bicombable, including 

(1) Homomorphisms to Z 

(2) Word length with respect to any finite generating set (not necessarily sym- 
metric) 

(3) Epstein-Fujiwara counting functions (see [10])

Bicombability of a function does not depend on a particular choice of combing, or 
even a generating set. Moreover, the set of all bicombable functions on a given group 
is a free Abelian group (of infinite rank). Although surely interesting in their own 
right, such functions are inexorably tied to the theory of quasimorphisms, in at least 
two ways. Firstly, the best-studied class of quasimorphisms on hyperbolic groups, 
the so-called Epstein-Fujiwara counting functions (i.e. bullet (3) above), turn out to
be bicombable. Secondly, given any bicombable function $\phi$, its antisymmetrization
$\psi$ defined by the formula $\psi(g) = \phi(g) - \phi(g^{-1})$ is a quasimorphism. A particularly
simple construction is as follows. Let $S$ be a finite generating set for $G$ (as a
semigroup), and $S^{-1}$ the set of elements inverse to $S$ in $G$. Let $\phi_S : G \to \mathbb{Z}$ compute
word length with respect to the generating set $S$, and $\phi_{S^{-1}} : G \to \mathbb{Z}$ compute
word length with respect to the generating set $S^{-1}$. Then both $\phi_S$ and $\phi_{S^{-1}}$ are
bicombable, and their difference $\psi_S := \phi_S - \phi_{S^{-1}}$ is a bicombable quasimorphism.
Typically $\psi_S$ will be highly nontrivial.

The main result in this paper is a Central Limit Theorem (Theorem 4.25) for
bicombable functions on a word hyperbolic group. If $\phi$ is a bicombable function
on $G$, and $\phi_n$ denotes the value of $\phi$ on a random element in $G$ of length $n$ (with
respect to some fixed generating set $S$ and with a suitable definition of random to
be made precise), there are real numbers $E$ and $\sigma$ so that there is convergence in
the sense of distribution

$$n^{-1/2}(\phi_n - nE) \to N(0,\sigma)$$

where $N(0,\sigma)$ denotes the normal (Gaussian) distribution with standard deviation
$\sigma$ (the possibility $\sigma = 0$ is allowed, in which case $N(0,\sigma)$ just denotes the Dirac
distribution supported at the origin). If $\phi$ is a quasimorphism, then symmetry forces
$E = 0$, so as a corollary one concludes that for any bicombable quasimorphism $\phi$
on a word-hyperbolic group, and for any constant $\epsilon > 0$, there is an $N$ and a $K$
such that for all $n > N$, there is a subset $G'_n$ of the set of all words $G_n$ of length $n$
so that $|G'_n|/|G_n| > 1 - \epsilon$, and $|\phi(g)| < K\sqrt{n}$ for all $g \in G'_n$.
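As a hedged numerical illustration of the statement above (not part of the argument), take the free group on a, b with its standard symmetric generating set and the homomorphism to Z sending a to 1 and b to 0. The sketch below computes the exact distribution of this homomorphism over all reduced words of length n, weighted uniformly, by dynamic programming over the last letter. Uniform weighting on reduced words, the choice of homomorphism, and the names `exponent_distribution` and `mean_and_variance` are assumptions made for illustration; the precise notion of "random" used in the paper is defined later.

```python
from collections import defaultdict
from fractions import Fraction

# Letters of the free group F_2; uppercase denotes the inverse generator.
LETTERS = "aAbB"
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}
# Hypothetical homomorphism phi: a -> 1, b -> 0 (chosen for illustration).
PHI = {"a": 1, "A": -1, "b": 0, "B": 0}


def exponent_distribution(n):
    """Return {value: count} for phi over all reduced words of length n."""
    # state[(last_letter, value)] = number of reduced words with that data
    state = {(None, 0): 1}
    for _ in range(n):
        nxt = defaultdict(int)
        for (last, value), count in state.items():
            for s in LETTERS:
                if last is not None and s == INVERSE[last]:
                    continue  # appending s would create a cancellation
                nxt[(s, value + PHI[s])] += count
        state = nxt
    dist = defaultdict(int)
    for (_, value), count in state.items():
        dist[value] += count
    return dict(dist)


def mean_and_variance(dist):
    """Exact mean and variance of a distribution given as {value: count}."""
    total = sum(dist.values())
    mean = Fraction(sum(v * c for v, c in dist.items()), total)
    var = Fraction(sum(v * v * c for v, c in dist.items()), total) - mean ** 2
    return mean, var


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        mean, var = mean_and_variance(exponent_distribution(n))
        print(n, float(mean), float(var) / n)
```

By symmetry the mean is exactly 0, in line with $E = 0$. The printed ratio of variance to n can be inspected to see whether it settles down, which is what fluctuations of order $\sqrt{n}$ would predict.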

Another corollary (Corollary 4.27), of independent interest to geometric group
theorists, is the following. Let $S_1$ and $S_2$ be two finite generating sets for a
non-elementary word hyperbolic group $G$. Then there is an algebraic number $\lambda_{1,2}$
such that for any $\epsilon > 0$ there is a constant $K$ and an $N$ so that if $G_n$ denotes
the set of elements of length $n > N$ in the $S_1$ metric, there is a subset $G'_n$ with
$|G'_n|/|G_n| > 1 - \epsilon$, so that for all $g \in G'_n$ there is an inequality

$$\bigl|\, |g|_{S_1} - \lambda_{1,2}\,|g|_{S_2} \bigr| < K \cdot \sqrt{n}$$

(of course, $|g|_{S_1} = n$ in this formula).

This observation can probably be generalized considerably. For instance, if H is 
a quasiconvex subgroup of a hyperbolic group G, then word length in G (in some 
generating set) is bicombable as a function on H. It follows that for any finite 
generating sets $S_H$ for $H$ and $S_G$ for $G$, there is some constant $\lambda$ such that almost
every word of length $n$ in $H$ in the $S_H$ metric has length $\lambda \cdot n$ in $G$ in the $S_G$ metric,
with error of order $\sqrt{n}$.

We say a word about the method of proof. In general terms, to derive a central 
limit theorem for random sums in a dynamical system one needs two ingredients: 
independence (i.e. hyperbolicity) and recurrence. Informally, one obtains a Gauss- 
ian distribution by adding up independent samples from the same distribution. In 
our context hyperbolicity, or independence of trials, comes from the finite size of 
the graph (and the finite memory of the automaton) parameterizing elements of the 
group. As is well-known to the experts, the difficult point is recurrence: a graph pa- 
rameterizing a combing of a hyperbolic group may typically fail to be recurrent. In 
fact, it may have many recurrent subgraphs whose adjacency matrices have largest 
real eigenvalue equal to the largest real eigenvalue for the adjacency matrix of the 
graph as a whole. In place of recurrence in this graph, we use recurrence at infinity, 
using Coornaert's theorem that the action of $G$ on $\partial G$ is ergodic with respect to a
Patterson-Sullivan measure. 

Statistical theorems for values of (certain) quasimorphisms on certain groups 
have been derived by other authors. Horsham-Sharp ([13]) recently showed that
a Hölder continuous quasimorphism (see § 5) on a free group F satisfies a central
limit theorem. In quite a different direction, Sarnak [21] showed that the values of 
the Rademacher function on PSL(2,Z), sorted by geodesic length, form a Cauchy 
distribution. Here the Rademacher function is a correction term to the logarithm 
of the Dedekind eta function, and arises in elementary number theory. Thinking of 
PSL(2, Z) as a subgroup of the group of isometries of the hyperbolic plane, every 
element whose trace has absolute value greater than 2 is represented by an isometry 
with an axis, with a well-defined (geodesic) translation length. The reason for the 
appearance of the Cauchy distribution instead of a Gaussian one has to do with 
the nonlinear relation between geodesic length and word length in PSL(2,Z), and 
the noncompactness of the quotient $\mathbb{H}^2$/PSL(2, Z). Of course, the Rademacher
function is bicombable, and therefore our main theorem shows that its values have 
a Gaussian distribution with respect to any word metric on the word-hyperbolic 
group PSL(2,Z). 

The plan of the paper is as follows. In § 2 we introduce basic definitions and
properties of word hyperbolic groups and finite directed graphs (which we call 
digraphs, following the computer scientists' convention). In § 3 we define combable 
functions and establish some of their basic properties, proving that the Epstein- 
Fujiwara counting quasimorphisms on hyperbolic groups are bicombable. In § 4 we 
prove the central limit theorem and derive the main corollaries of interest. Finally 
in § 5 we briefly discuss the results of Horsham-Sharp. 

2. Groups and digraphs 

2.1. Hyperbolic groups. 

Definition 2.1. A path metric space is $\delta$-hyperbolic for some $\delta \ge 0$ if for every
geodesic triangle $abc$ every point in the edge $ab$ is contained in the union of the
$\delta$-neighborhoods of the other two edges:

$$ab \subset N_\delta(bc) \cup N_\delta(ca)$$

A group $G$ with a finite generating set $S$ is $\delta$-hyperbolic if the Cayley graph $C_S(G)$
is $\delta$-hyperbolic as a path metric space.

$G$ is word-hyperbolic (or simply hyperbolic) if there is a $\delta \ge 0$ and a finite
generating set $S$ for which $C_S(G)$ is $\delta$-hyperbolic.

A word-hyperbolic group is elementary if it contains a finite index cyclic sub- 
group. If G is nonelementary, it contains many quasi-isometrically embedded non- 
abelian free groups of arbitrary rank, by the ping-pong lemma. 

We assume the reader is familiar with basic elements of coarse geometry:
$(k, \epsilon)$-quasi-isometries, quasigeodesics, etc. We summarize some of the main properties
of $\delta$-hyperbolic spaces below:

Theorem 2.2 (Gromov; basic properties of $\delta$-hyperbolic spaces). Let $X$ be a
$\delta$-hyperbolic path metric space.

(1) Morse Lemma. For every $k, \epsilon$ there is a universal constant $C(\delta, k, \epsilon)$ such
that every $(k, \epsilon)$-quasigeodesic segment with endpoints $p, q \in X$ lies in the
$C$-neighborhood of any geodesic joining $p$ to $q$.

(2) Quasigeodesity is local. For every $k, \epsilon$ there is a universal constant
$C(\delta, k, \epsilon)$ such that every map $\phi : \mathbb{R} \to X$ which restricts on each segment
of length $C$ to a $(k, \epsilon)$-quasigeodesic is (globally) $(2k, 2\epsilon)$-quasigeodesic.

(3) Ideal boundary. There is an ideal boundary $\partial X$ functorially associated to
$X$, whose points consist of quasigeodesic rays up to the equivalence relation
of being a finite Hausdorff distance apart, which is metrizable. If $X$ is
proper, $\partial X$ is compact, and any quasi-isometric embedding $X \to Y$ between
hyperbolic spaces induces a continuous map $\partial X \to \partial Y$.

For proofs and a more substantial discussion, see [12].

2.2. Regular languages. We give a brief overview of the theory of regular lan- 
guages and automata. There are many conflicting but equivalent definitions in the 
literature; we use a parsimonious formulation which is adequate for this paper. For 
a substantial introduction to the theory of regular languages see [9].

Definition 2.3. An alphabet is a finite set whose elements are called letters. A
word (or sometimes string) on an alphabet is a finite ordered sequence of letters. 
A language is a subset of the set of all words on a fixed alphabet. A language is 
prefix closed if every prefix of an element of the language is also in the language. 

If $S$ is an alphabet, the set of all words on $S$ is denoted $S^*$, and a language is a
subset of $S^*$.

Convention 2.4. Throughout this paper, if $w$ denotes a word in $S^*$, then $w_i$ will
denote the prefix of length $i$ and $w(i)$ will denote the $i$th letter. The length of $w$ is
denoted $|w|$. Hence (for example), $w_0$ denotes the empty word, and $|w_i| = i$.

Definition 2.5. A finite state automaton (or just automaton for short) on a fixed
alphabet $S$ is a finite directed graph $\Gamma$ with a distinguished initial vertex $v_1$ (the
input state), and a choice of a distinguished subset $Y$ of the vertices of $\Gamma$ (the accept
states), whose oriented edges are labeled by letters of $S$ in such a way that at each
vertex there is at most one outgoing edge with any given label.

The vertices are called the states of the automaton. A word $w$ in the alphabet
determines a directed path $\gamma$ in an automaton, defined as follows. The path $\gamma$ starts
at the initial vertex at time 0, and proceeds by reading the letters of $w$ one at a
time, and moving along the directed edge with the corresponding label if one exists,
and halts if not. If the automaton reads to the end of $w$ without halting, the last
vertex of $\gamma$ is the final state, and the word is accepted if the final state of $\gamma$ is an
accept state, and rejected otherwise. In this paper, by convention, if an automaton
halts before reaching the end of a word, the word is rejected.
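The reading procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the class name `Automaton`, the dictionary encoding of labeled edges, and the example automaton for reduced words in a free group of rank 2 (uppercase letters standing for inverse generators) are all hypothetical choices, not constructions taken from the paper.

```python
class Automaton:
    """Partial deterministic automaton in the sense of Definition 2.5."""

    def __init__(self, initial, accept, edges):
        # edges maps (state, letter) -> state, so each vertex has at most
        # one outgoing edge with any given label.
        self.initial = initial
        self.accept = set(accept)
        self.edges = dict(edges)

    def path(self, word):
        """Return the list of states visited, or None if the automaton halts."""
        state = self.initial
        visited = [state]
        for letter in word:
            key = (state, letter)
            if key not in self.edges:
                return None  # no outgoing edge with this label: halt
            state = self.edges[key]
            visited.append(state)
        return visited

    def accepts(self, word):
        """A word is accepted if it is read to the end and ends in an accept state."""
        visited = self.path(word)
        return visited is not None and visited[-1] in self.accept


def reduced_word_automaton():
    """Illustrative automaton accepting reduced words on a, A, b, B."""
    letters = "aAbB"
    inverse = {"a": "A", "A": "a", "b": "B", "B": "b"}
    edges = {("start", s): s for s in letters}
    for last in letters:
        for s in letters:
            if s != inverse[last]:
                edges[(last, s)] = s  # the state remembers the last letter read
    return Automaton("start", ["start", *letters], edges)
```

In this example every state is an accept state, so a word is accepted exactly when the automaton does not halt while reading it.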

Definition 2.6. A language $L \subset S^*$ is regular if it is exactly the set of words
accepted by some finite state automaton.

A regular language $L$ is prefix-closed if and only if it is accepted by some finite
state automaton for which every vertex is an accept state. In this paper, we are only
concerned with prefix-closed regular languages. Hence a word $w$ in $S^*$ is accepted
if and only if it corresponds to a directed path of length $|w|$.

Convention 2.7. Throughout this paper, if $\gamma$ denotes a directed path in a graph,
then $\gamma_i$ will denote the initial subpath of length $i$, and $\gamma(i)$ will denote the $i$th
vertex in the path after the initial vertex. The length of $\gamma$ is denoted $|\gamma|$. Hence
(for example) $\gamma(0)$ denotes the initial vertex.

If $\Gamma$ is the underlying directed graph of the automaton, we say that $\Gamma$ parameterizes
$L$. If $L$ is prefix closed and parameterized by $\Gamma$, there is a natural bijection
between elements of $L$ and paths in $\Gamma$ starting at the initial vertex.

Remark 2.8. Some authors by convention add a terminal "fail" state to an automa- 
ton, and add directed labeled edges to this fail state so that at every vertex there 
is a full set of outgoing directed labeled edges, one for each element of the alpha- 
bet. In this paper, automata will only be fed accepted words, so this convention is 
irrelevant. 

An automaton is like a computer with finite resources running a specific program.
The language in which the program is written only allows bounded loops (i.e. loops
of the form "for i = 1 to 10 do", where the parameters of the loop are explicitly
spelled out) in order to certify that the computer will not run out of resources
on execution of the program.

Describing an automaton in terms of vertices and edges is like describing a pro- 
gram in terms of machine code. It is more transparent, and the result is both more 
human readable and more easily checkable, to describe instead (in more prosaic 
terms) the program that the automaton carries out. 

2.3. Directed graphs and components. Now and in the sequel we refer to directed
graphs by the computer scientists' term "digraphs". A pointed digraph is a
digraph with a choice of an initial vertex.

Definition 2.9. Let $\Gamma$ be a digraph. A vertex $v$ is accessible from a vertex $u$ if
there is a directed path in $\Gamma$ from $u$ to $v$. A component is a maximal subgraph such
that for any two vertices $u, v$ in the subgraph (not necessarily distinct) the state $v$
is accessible from $u$.

Remark 2.10. A subgraph in which every vertex is accessible from every other is 
sometimes called recurrent, so a component is just a maximal recurrent subgraph. 
We use both terms interchangeably in the sequel. 

Different components in a digraph are disjoint. If $\Gamma$ is a pointed digraph, associated
to $\Gamma$ is a digraph $C(\Gamma)$ whose vertices are components of $\Gamma$, together with
an edge from one component to the other if there is a directed path in $\Gamma$ between
the corresponding components which does not pass through any other component.
Note that it is entirely possible that $C(\Gamma)$ is empty.

We can enhance $C(\Gamma)$ to a pointed digraph as follows. If the initial vertex $v_1$
of $\Gamma$ is contained in a component $C$, the corresponding vertex $C$ of $C(\Gamma)$ becomes
the initial vertex. Otherwise add an initial vertex to $C(\Gamma)$, and join it by a path
to each component $C$ that can be reached in $\Gamma$ by a path from the initial vertex
which does not pass through any other component.

Lemma 2.11. There are no oriented loops in $C(\Gamma)$.

Proof. If there is a path from $C$ to itself in $C(\Gamma)$ passing through $C'$ then there is
a loop in $\Gamma$ from some vertex in $C$ to itself passing through $C'$. Hence every vertex
of $C'$ is accessible from every vertex of $C$ and conversely, contrary to the fact that
$C$ and $C'$ are maximal. □
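The components of Definition 2.9 and the digraph of components can be computed directly for small examples. The following is a rough sketch, assuming a digraph is encoded as a dictionary from each vertex to a list of successors; the function names are hypothetical, and reachability is computed naively rather than with a linear-time algorithm. A vertex that lies on no directed loop belongs to no component, which is why the set of components can be empty.

```python
def reachable(graph, start):
    """Vertices reachable from start by a directed path of length >= 1."""
    seen, stack = set(), list(graph.get(start, ()))
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(graph.get(v, ()))
    return seen


def components(graph):
    """Maximal recurrent subgraphs, returned as frozensets of vertices."""
    vertices = set(graph) | {v for targets in graph.values() for v in targets}
    reach = {v: reachable(graph, v) for v in vertices}
    comps = set()
    for v in vertices:
        if v in reach[v]:  # v lies on a directed loop
            comps.add(frozenset(u for u in reach[v] if v in reach[u]))
    return comps


def component_graph(graph):
    """Components together with the edges between them.

    An edge (c, d) is recorded when some directed path leads from c to d
    through vertices lying in no component.
    """
    comps = components(graph)
    owner = {v: c for c in comps for v in c}
    edges = set()
    for c in comps:
        seen, stack = set(), [w for v in c for w in graph.get(v, ())]
        while stack:
            w = stack.pop()
            if w in seen or w in c:
                continue
            seen.add(w)
            if w in owner:
                edges.add((c, owner[w]))  # stop here: w is in another component
            else:
                stack.extend(graph.get(w, ()))
    return comps, edges
```

For instance, `component_graph({0: [1], 1: [2], 2: [1, 3], 3: [4], 4: [4]})` finds the two components {1, 2} and {4} joined by a single edge, and no oriented loop, as Lemma 2.11 requires.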

2.4. Combings. Let $G$ be a group and $S$ a finite subset which generates $G$ as a
semigroup. There is an evaluation map $S^* \to G$ taking a word in the alphabet $S$ to
the corresponding element in $G$. We denote this map by $w \mapsto \overline{w}$. Note that a word
$w \in S^*$ can equally be thought of as a directed path in the Cayley graph $C_S(G)$
from id to $\overline{w}$. Note further if $S$ is not symmetric (i.e. $S \ne S^{-1}$), the Cayley graph
$C_S(G)$ should really be thought of as a digraph.

Convention 2.12. By Convention 2.4, $\overline{w_i}$ denotes the element in $G$ corresponding
to the prefix $w_i$ of $w$. Hence (for example), $\overline{w_0} = \mathrm{id}$ for any $w$. By Convention 2.7,
if $\gamma$ denotes the corresponding path in $C_S(G)$ from id to $\overline{w}$, then (by abuse of
notation) we write $\gamma(i) = \overline{w_i}$ as elements of $G$.

Definition 2.13. Let $G$ be generated as a semigroup by a finite subset $S$. A
combing of $G$ with respect to $S$ is a prefix-closed regular language $L \subset S^*$ for which
the evaluation map is a bijection $L \to G$ and such that every element of $L$ is a
geodesic.

Geometrically, let $\Gamma$ be a digraph parameterizing $L$. Let $\tilde{\Gamma}$ be the universal cover
of $\Gamma$, let $\tilde{v}_1 \in \tilde{\Gamma}$ be a lift of the initial vertex, and let $T \subset \tilde{\Gamma}$ be the subgraph
consisting of all points accessible from $\tilde{v}_1$. There is a natural map $T \to C_S(G)$
taking $\tilde{v}_1$ to id and directed paths to directed paths. The statement that $L$ is a
combing with respect to $S$ corresponds to the geometric statement that the image
of $T$ is a spanning subtree of $C_S(G)$, and that directed paths in $T$ are taken to
geodesics in $C_S(G)$.

Hyperbolic groups admit combings with respect to any finite generating set. In
fact, let $G$ be hyperbolic and let $S$ be a fixed finite generating set. Choose a total
order $\prec$ on the elements of $S$, and by abuse of notation, let $\prec$ denote the induced
lexicographic order on $S^*$. That is, $w \prec w'$ for words $w, w'$ if there are expressions
$w = xsy$, $w' = xs'y'$ for $s, s' \in S$ and $x, y, y' \in S^*$ where $s \prec s'$.

Let $L \subset S^*$ denote the language of lexicographically first geodesics in $G$. That
is, for each $g \in G$, a word $w \in S^*$ with $\overline{w} = g$ is in $L$ if and only if $|w| = |g|$, and
$w \prec w'$ for all other $w' \in S^*$ with these properties. It is clear that $L$ is prefix-closed
and bijective with $G$.
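For a finite group (which is trivially hyperbolic) the language of lexicographically first geodesics can be listed by breadth-first search, since the first word to reach an element is shortest and, among shortest words, lexicographically least. The sketch below assumes elements are hashable, that the generators are supplied in increasing order, and that words are evaluated left to right; the function names and the example of the symmetric group on three letters are illustrative choices only.

```python
from collections import deque


def lex_first_geodesics(identity, generators, multiply):
    """Map each group element to its lexicographically first geodesic word.

    generators: list of (letter, element) pairs, listed in increasing order.
    multiply(g, s): the product g*s, so a word is evaluated left to right.
    The search only terminates for a finite group.
    """
    words = {identity: ""}
    queue = deque([identity])
    while queue:
        g = queue.popleft()
        for letter, s in generators:
            h = multiply(g, s)
            if h not in words:  # first arrival: shortest, then lex-least
                words[h] = words[g] + letter
                queue.append(h)
    return words


def compose(p, q):
    """Product of permutations stored as tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


if __name__ == "__main__":
    s, t = (1, 0, 2), (0, 2, 1)  # two transpositions generating S_3
    table = lex_first_geodesics((0, 1, 2), [("s", s), ("t", t)], compose)
    print(sorted(table.values(), key=lambda w: (len(w), w)))
```

The resulting set of words is prefix-closed and in bijection with the group, as asserted in the text.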

Theorem 2.14 (Cannon [5], [9]). Let $G$ be a hyperbolic group, and $S$ a finite
generating set. The language of lexicographically first geodesics is regular (and
therefore defines a combing of $G$ with respect to $S$).

Remark 2.15. Note that the choice of a digraph $\Gamma$ parameterizing a combing $L$ is not
part of the data of a combing. A given regular language is typically parameterized
by infinitely many different digraphs.

Warning. Many different definitions of combing exist in the literature. They are 
often, but not always, assumed to be bijective, and sometimes to be quasigeodesic. 
See [18] for a survey and some discussion.

3. Combable functions

3.1. Left and right invariant metrics. Let $G$ be a group with finite generating
set $S$ (as a semigroup). There are two natural metrics on $G$ associated to $S$: a
left invariant metric $d_L$ which is just the path metric in the Cayley graph $C_S(G)$,
and a right invariant metric $d_R$ where $d_R(a, b) = d_L(a^{-1}, b^{-1})$. If $|\cdot|$ denotes word
length with respect to $S$, then

$$d_L(a, b) = |a^{-1}b| \quad\text{and}\quad d_R(a, b) = |ab^{-1}|$$

Remark 3.1. If $S$ is not equal to $S^{-1}$ then this is not strictly speaking a metric,
since it is not symmetric in its arguments.

Lemma 3.2. A function $f : G \to \mathbb{R}$ is Lipschitz for the $d_L$ metric if and only if
there is a constant $C$ so that for all $a \in G$ and $s \in S$ there is an inequality

$$|f(as) - f(a)| < C$$

Similarly, $f$ is Lipschitz for the $d_R$ metric if

$$|f(sa) - f(a)| < C$$

Proof. These properties follow immediately from the definitions. □ 

Remark 3.3. Note the curious intertwining of left and right: a function is Lipschitz 
in the left invariant metric if and only if it changes by a bounded amount under 
multiplication on the right by a generator, and vice versa. 

3.2. Combable functions. Suppose $G$ is a group with finite generating set $S$, and
let $L \subset S^*$ be a combing. Let $\Gamma$ be a digraph parameterizing $L$. There are natural
bijections

paths in $\Gamma$ starting at the initial vertex $\longleftrightarrow$ $L$ $\longleftrightarrow$ $G$

If $w \in L$, and $\gamma$ denotes the corresponding path in $\Gamma$, then recall that Convention 2.7
says that $\gamma(i)$ is the vertex in $\Gamma$ visited after reading the first $i$ letters of $w$.

Definition 3.4. Let $G$ be a group with finite symmetric generating set $S$, and let
$L \subset S^*$ be a combing. A function $\phi : G \to \mathbb{Z}$ is weakly combable with respect to
$S, L$ (or weakly combable for short) if there is a digraph $\Gamma$ parameterizing $L$, and a
function $d\phi$ from the vertices of $\Gamma$ to $\mathbb{Z}$ such that for any $w \in L$ there is an equality

$$\phi(\overline{w}) = \sum_{i=0}^{|w|} d\phi(\gamma(i))$$

where $\gamma$ is the directed path in $\Gamma$ corresponding to $w$.

The function $\phi$ is combable if it is weakly combable and Lipschitz in the $d_L$
metric, and it is bicombable if it is weakly combable and Lipschitz in both the $d_L$
and the $d_R$ metrics.

In words, a function $\phi$ is weakly combable if its derivative along elements in a
combing can be calculated by an automaton.

Example 3.5. Word length (with respect to a fixed finite generating set $S$, symmetric
or not) is bicombable. In fact, if $L$ is any combing for $S$, and $\Gamma$ is any digraph
parameterizing $L$, the function $d\phi$ taking the value 1 on every (noninitial) vertex
of $\Gamma$ satisfies the desired properties.

Example 3.6. Let $F$ be the free group with generators $a, b$. Let $S = \{a, b, a^{-1}, b^{-1}\}$,
and let $L$ be the language of reduced words in the generating set. The function

$$\phi(w) = \begin{cases} |w| & \text{if } w \text{ begins with } a \\ 0 & \text{otherwise} \end{cases}$$

is combable but not bicombable.
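Example 3.6 can be checked by hand, and the following sketch does the same check mechanically. It assumes one particular digraph parameterizing the reduced words, whose states record whether the word begins with a and which letter was read last; the discrete derivative is 1 on states reached by words beginning with a and 0 elsewhere. Uppercase letters stand for inverse generators, and all names are hypothetical.

```python
LETTERS = "aAbB"  # uppercase letters denote inverse generators
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}
START = "start"


def step(state, letter):
    """Transition of a digraph parameterizing reduced words.

    A state is START or a pair (begins_with_a, last_letter).
    """
    if state == START:
        return (letter == "a", letter)
    begins_with_a, last = state
    if letter == INVERSE[last]:
        raise ValueError("word is not reduced")
    return (begins_with_a, letter)


def dphi(state):
    """Discrete derivative: 1 on states reached by words beginning with a."""
    return 1 if state != START and state[0] else 0


def phi(word):
    """Sum of dphi over the vertices visited while reading a reduced word."""
    state = START
    total = dphi(state)
    for letter in word:
        state = step(state, letter)
        total += dphi(state)
    return total


def reduce_word(word):
    """Freely reduce a word by cancelling adjacent inverse pairs."""
    out = []
    for letter in word:
        if out and out[-1] == INVERSE[letter]:
            out.pop()
        else:
            out.append(letter)
    return "".join(out)


def reduced_words(n):
    """All reduced words of length exactly n."""
    words = [""]
    for _ in range(n):
        words = [w + s for w in words for s in LETTERS
                 if not w or s != INVERSE[w[-1]]]
    return words
```

Multiplying on the right by a generator changes the value by at most 1, whereas multiplying the word a b^n on the left by the inverse of a changes the value from n + 1 to 0, so the function is not Lipschitz for the right invariant metric.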

The definition of weakly combable depends on a generating set and a combing. 

Example 3.7. Let $G = \mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$ where $\mathbb{Z}$ is generated by $a$ and $\mathbb{Z}/2\mathbb{Z}$ is generated
by $b$. Let $S = \{a, a^{-1}, b\}$ and let $L$ be the combing consisting of words of the form
$a^n$ and $ba^n$ for different $n$. Let $S' = \{b, (ab), (ab)^{-1}\}$ and let $L'$ be the combing
consisting of words of the form $(ab)^n$ and $b(ab)^n$ (each bracketed term denotes a
single "letter" of $S'$). Let $\phi : G \to \mathbb{Z}$ be the function defined by $\phi(a^n) = n$ and
$\phi(ba^n) = 0$. Then $\phi$ is weakly combable with respect to $L$, but not with respect to
$L'$ since the derivative $d\phi$ is unbounded along words of $L'$.
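The unboundedness claimed in Example 3.7 is easy to see numerically. In the sketch below an element of the group is stored as a pair (n, e) standing for a^n b^e with e in {0, 1}; this encoding and the function names are assumptions made for illustration.

```python
def phi(element):
    """phi(a^n) = n and phi(b a^n) = 0, with elements stored as (n, e)."""
    n, e = element
    return n if e == 0 else 0


def power_of_ab(k):
    """The element (ab)^k for k >= 0, using that a and b commute."""
    return (k, k % 2)


def derivative_along_ab(k_max):
    """Successive differences of phi along the prefixes of (ab)^k_max."""
    values = [phi(power_of_ab(k)) for k in range(k_max + 1)]
    return [values[k + 1] - values[k] for k in range(k_max)]
```

The successive differences alternate in sign and grow linearly in absolute value, so no function on the vertices of a finite digraph can compute them.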

However, the property of being (bi-)combable does not depend on a generating 
set, or a choice of combing. 

Lemma 3.8. Let $G$ be a word-hyperbolic group. Suppose $S, S'$ are two finite symmetric
generating sets, and $L, L'$ bijective geodesic combings in $S^*$ and $(S')^*$ respectively.
Then the sets of functions which are (bi-)combable with respect to $L$ and
with respect to $L'$ are equal.

Proof. The property of being Lipschitz in either the left or right metric is independent
of a choice of generating set (and does not depend on a combing), so it suffices
to prove the lemma for combable functions. Let $\phi$ be combable with respect to $S$,
$L$. We will show it is combable with respect to $S'$, $L'$.

Let $w'$ be a word in $L'$, and let $w$ be the word in $L$ with $\overline{w'} = \overline{w}$. Let $\gamma$ be a path
in $C_S(G)$ corresponding to $w$, and let $\gamma'$ be a path in $C_{S'}(G)$ corresponding to $w'$.
Although $\gamma$ does not make sense as a path in $C_{S'}(G)$, nevertheless each vertex $\gamma(i)$
is an element of $G$, and therefore corresponds to a unique vertex in $C_{S'}(G)$. The
"discrete path" determined by the vertices $\gamma(i)$ is a quasigeodesic in $C_{S'}(G)$, and
by the Morse Lemma, fellow travels the path $\gamma'$. In other words:

(1) there is a constant $C_1$ so that every $\gamma(i)$ is within distance $C_1$ of some $\gamma'(j)$
and every $\gamma'(j)$ is within distance $C_1$ of some $\gamma(i)$

(2) there is a constant $C_2$ so that if $\gamma(i)$ is within distance $C_1$ of $\gamma'(j)$ then
$\gamma(k)$ is within distance $C_1$ of $\gamma'(j + 1)$ for some $k$ with $|k - i| < C_2$

Here distances are all measured in $C_{S'}(G)$.

Suppose we are given some $v' \in L'$. A word $v$ in $L$ is good (for $v'$) if it satisfies
the following properties. Let $\sigma'$ be the path in $C_{S'}(G)$ associated to $v'$, and $\sigma$ the
path in $C_S(G)$ associated to $v$, and think of the vertices $\sigma(i)$ as vertices in $C_{S'}(G)$.
Then $v$ is good if it satisfies properties (1) and (2) above (with $\sigma$ and $\sigma'$ in place
of $\gamma$ and $\gamma'$), and if furthermore the distance from $\overline{v'}$ to $\overline{v}$ is at most $C_1$. Notice if
$w'_j$ is a prefix of $w'$, then there is some prefix $w_i$ of $w$ as above that is good.

Let $\Gamma'$ be a digraph parameterizing $L'$, and let $\Gamma$ be a digraph parameterizing $L$
for which there is $d\phi$ from the states of $\Gamma$ to $\mathbb{Z}$ as in Definition 3.4. Let $B$ denote
the (set of vertices in the) ball of radius $C_1$ about id in $C_{S'}(G)$. For each $i$, let $\mu_i$
be the function whose domain is $B$, and for each $b \in B$ define $\mu_i(b)$ as follows:

(1) if there is no good word $v \in L$ (for $w'_i$) with $\overline{v} = \overline{w'_i}b$, set $\mu_i(b)$ equal to an
"out of range" symbol $E$; otherwise:

(2) set $\mu_i(b)$ equal to the tuple whose first entry is the value $\phi(\overline{w'_i}b) - \phi(\overline{w'_i})$
and whose second entry is the state of $\Gamma$ after reading the word $v$

Note that given $b$, the word $v$ if it exists is unique, since $L \to G$ is a bijection.
Moreover, since $\phi$ is combable, there is a constant $C_3$ so that
$\phi(\overline{w'_i}b) - \phi(\overline{w'_i}) \in [-C_3, C_3]$. In other words, the range of $\mu_i$ is the finite set
$E \cup [-C_3, C_3] \times \Gamma$ where
for the sake of brevity, and by abuse of notation, we have denoted the set of states
of $\Gamma$ by $\Gamma$ (this is the only point in the argument where combability (as distinct
from weak combability) is used). Finally, observe that $\mu_i(\mathrm{id})$ is never equal to $E$,
since there is a good $v$ for $w'_i$ with $\overline{v} = \overline{w'_i}$, for every $i$.

We claim that the function $\mu_i$ and the letter $w'(i + 1)$ together determine the
function $\mu_{i+1}$. For each $b \in B$, the function $\mu_i$ tells us whether or not there is
a good word $v$ with $\overline{v} = \overline{w'_i}b$, and if there is, the state of $\Gamma$ after reading such a
word. There are finitely many words $x$ of length at most $C_2$ in $S^*$ which can be
concatenated to $v$ to obtain a word $vx \in L$, and the set of such $x$ depends only on
the state of $\Gamma$ after reading $v$. Each suffix $x$ determines a path $\beta$ in $C_S(G)$ and a
discrete path $\overline{v}\beta(j)$ in $C_{S'}(G)$. Whether the composition $vx$ is good for $w'_{i+1}$ or not
depends only on the suffix $x$, so the set of such suffixes can be determined for each
$b \in B$, and the set of endpoints determines a subset of vertices $b' \in B$ for which
$\mu_{i+1}(b') \ne E$. For each such $b'$, the state of $\Gamma$ after reading $vx$ can be determined,
and the difference $\phi(\overline{vx}) - \phi(\overline{v})$ can be determined by the combability of $\phi$ with
respect to $\Gamma$.

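We claim further that the function $\mu_i$ and the letter $w'(i + 1)$ together determine
$\phi(\overline{w'_{i+1}}) - \phi(\overline{w'_i})$. For, there is a unique $b$ and good $v \in L$ with $\overline{v} = \overline{w'_i}b$ which
can be extended to $vx \in L$ with $\overline{vx} = \overline{w'_{i+1}}$. Because we know the state of $\Gamma$ after
reading $v$, we know $\phi(\overline{vx}) - \phi(\overline{v})$. Moreover, we know $\phi(\overline{v}) - \phi(\overline{w'_i})$ so we know
$\phi(\overline{w'_{i+1}}) - \phi(\overline{w'_i})$ as claimed.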

From this data it is straightforward to construct a new $\Gamma''$ parameterizing $L'$
so that $d\phi$ exists as a function on (the states of) $\Gamma''$. Every word $w' \in L'$ and
every integer $i \ge 0$ determines a function $\mu_i$, and a state $\gamma'(i) \in \Gamma'$ (by abuse of
notation, but in accordance with Convention 2.7 we denote the path in $\Gamma'$ associated
to $w'$ by $\gamma'$), and thereby a triple $(\mu_i, \gamma'(i), \phi(\overline{w'_i}) - \phi(\overline{w'_{i-1}}))$ (with the convention
$\phi(\overline{w'_{-1}}) = 0$). The vertices of $\Gamma''$ are the set of all possible triples that occur, over
all $w' \in L'$ and all $i$; note that this is a finite set. The directed edges of $\Gamma''$ are the
ordered pairs of triples that can occur for some $w'$ and successive indices $i, i + 1$.
Evidently $\Gamma''$ parameterizes $L'$; moreover, $d\phi$ is the function whose value on a triple
as above is the last co-ordinate. □

The sum or difference of two (bi-)combable functions is (bi-)combable, and the
set of all (bi-)combable functions on a fixed $G$ is therefore a countably infinitely
generated free Abelian summand of the Abelian group $\mathbb{Z}^G$ of all $\mathbb{Z}$-valued functions
on $G$.

Example 3.9. Word length with respect to a finite generating set $S_1$ is bicombable
with respect to $L$ where $L$ is a combing with respect to another finite generating
set $S_2$ (compare with Example 3.5).

3.3. Quasimorphisms. 

Definition 3.10. Let $G$ be a group. A quasimorphism is a real valued function
$\phi$ on $G$ for which there is a least non-negative real number $D(\phi)$ called the defect,
such that for all $a, b \in G$ there is an inequality

$$|\phi(a) + \phi(b) - \phi(ab)| \le D(\phi)$$

In words, a function is a quasimorphism if it is a homomorphism up to a bounded
error.

Quasimorphisms are related to stable commutator length and to 2-dimensional
bounded cohomology. For a more substantial discussion, see [1] or [3].

Lemma 3.11. A weakly combable quasimorphism is bicombable. 

Proof. For any $g \in G$ and $s \in S$ we have

$$|\phi(gs) - \phi(g)| \le |\phi(s)| + D(\phi)$$

and

$$|\phi(sg) - \phi(g)| \le |\phi(s)| + D(\phi)$$

Since $S$ is finite, the conclusion follows. □

Conversely, let $\phi$ be bicombable. Then we have the following lemma:

Lemma 3.12. Let $\phi$ be bicombable. Then there is a constant $C$ so that if $w \in L$
is expressed as a product of subwords $w = uv$ then $|\phi(w) - \phi(u) - \phi(v)| \le C$.

Proof. Let $u' \in L$ be any word so that the paths in $\Gamma$ associated to $u$ and to $u'$ end
at the same vertex. Then $\phi(w) = \phi(u) + \phi(u'v) - \phi(u')$. Now choose $u'$ of bounded
length, and notice that the difference of $\phi(u'v)$ and $\phi(v)$ is bounded because of the
Lipschitz condition. □

Using this Lemma we can construct many bicombable quasimorphisms on G. 

Theorem 3.13. Let $G$ be hyperbolic, and let $S$ be a finite generating set for $G$ as a
semigroup. Let $\phi_S$ be the bicombable function which counts word length with respect
to $S$, and $\phi_{S^{-1}}$ the bicombable function which counts word length with respect to
$S^{-1}$. Then $\psi_S := \phi_S - \phi_{S^{-1}}$ is a bicombable quasimorphism.

Proof. Since each of $\phi_S$ and $\phi_{S^{-1}}$ is bicombable, so is their difference. Let $g, h$ be
arbitrary. Let $w, u, v \in L$ correspond to $gh, g, h$ respectively. Then we can write
$u = u'x$, $v = yv'$ and $w = w_1 w_2$ as words in $L$ so that $d_L(y, x^{-1}) \le \delta$,
$d_L(u', w_1) \le \delta$ and $d_R(v', w_2) \le \delta$ by $\delta$-thinness of triangles. Now apply
Lemma 3.12. □

Remark 3.14. Given an arbitrary bicombable function $\phi$, one can define $\psi$ by
$\psi(g) = \phi(g) - \phi(g^{-1})$. It is not necessarily true that $\psi$ is bicombable, but an
argument similar to the proof of Theorem 3.13 shows that $\psi$ is a quasimorphism.

3.4. Counting quasimorphisms. Among the earliest known examples of quasimorphisms
are counting quasimorphisms on free groups, first introduced by Brooks
[3]. Brooks' construction was generalized to word-hyperbolic groups by
Epstein-Fujiwara [10].

Definition 3.15. Let $G$ be a hyperbolic group with generating set $S$. Let $\sigma$ be an
oriented simplicial path in the Cayley graph $C_S(G)$ and let $\sigma^{-1}$ denote the same
path with the opposite orientation. A copy of $\sigma$ is a translate $a \cdot \sigma$ for some $a \in G$.
For $\alpha$ an oriented simplicial path in $C_S(G)$, let $|\alpha|_\sigma$ denote the maximal number of
disjoint copies of $\sigma$ contained in $\alpha$. For a directed path $\alpha$ in $C_S(G)$ from id to $a$,
define

$$c_\sigma(\alpha) = |a| - (\mathrm{length}(\alpha) - |\alpha|_\sigma)$$

and define

$$c_\sigma(a) = \sup_\alpha c_\sigma(\alpha)$$

where the supremum is taken over all directed paths $\alpha$ in $C_S(G)$ from id to $a$.
Define a counting quasimorphism to be a function of the form

$$\phi_\sigma(a) := c_\sigma(a) - c_{\sigma^{-1}}(a)$$

Remark 3.16. There is no logical necessity to count disjoint copies of $\sigma$ in $\alpha$ rather
than all copies. The latter "big" counting function agrees with the convention of
Brooks whereas the former "small" counting function agrees with the convention
of Epstein-Fujiwara.

In the sequel, by convention assume that the length of $\sigma$ is at least 2. From the
definition, $\phi_\sigma(g) = -\phi_\sigma(g^{-1})$ for all $g$. A path $\alpha$ which realizes the supremum in
the formula above is called a realizing path for $a$. Since the value of each term of
the form $|\cdot|$, $\mathrm{length}(\cdot)$, $|\cdot|_\sigma$ is integral, a realizing path exists for every $a$.
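The antisymmetry above can be checked by hand in the simplest setting. The following is a minimal sketch, not the general definition: it assumes the free group $F_2$, encodes the generators as `a`, `b` with inverses `A`, `B` (a hypothetical encoding), and counts disjoint copies only along the reduced word rather than taking a supremum over all paths, in the spirit of the "small" counting function of Remark 3.16.

```python
from itertools import product

# Hypothetical encoding of F_2: generators a, b and their inverses A, B.
LETTERS = "aAbB"

def inverse(word):
    """Inverse of a word: reverse it and invert each letter (swap case)."""
    return word[::-1].swapcase()

def is_reduced(word):
    """A word is reduced if no letter is adjacent to its own inverse."""
    return all(x != y.swapcase() for x, y in zip(word, word[1:]))

def disjoint_copies(sigma, word):
    """Count disjoint copies of sigma in word, scanning greedily from the left."""
    count, pos = 0, word.find(sigma)
    while pos != -1:
        count += 1
        pos = word.find(sigma, pos + len(sigma))
    return count

def small_counting(sigma, word):
    """Copies of sigma minus copies of sigma^{-1} along the reduced word."""
    return disjoint_copies(sigma, word) - disjoint_copies(inverse(sigma), word)

def reduced_words(max_len):
    """Yield every reduced word of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(LETTERS, repeat=n):
            w = "".join(letters)
            if is_reduced(w):
                yield w
```

Evaluating `small_counting("ab", w)` and `small_counting("ab", inverse(w))` over `reduced_words(6)` gives values that are negatives of each other, since copies of $\sigma$ in $w^{-1}$ correspond to copies of $\sigma^{-1}$ in $w$.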

The crucial property of realizing paths for our applications is the following
quasigeodesity property:

Lemma 3.17 (Epstein-Fujiwara [10]). Let $\sigma$ be a path in $C_S(G)$ of length at least
2, and let $a \in G$. A realizing path for $a$ is a $K, \epsilon$-quasigeodesic in $C_S(G)$, where $K$
and $\epsilon$ are universal.

Note that if $G$ is $\delta$-hyperbolic with respect to some $S$, then a $K, \epsilon$-quasigeodesic is
within distance $C$ of an actual geodesic, where $C$ depends on $K, \epsilon, \delta$. In other
words, a realizing path (asynchronously) fellow-travels any geodesic representative
of the word.

3.5. Greedy algorithm. Let $\sigma$ be a string. Let $k_\sigma$ count the maximal number
of disjoint copies of $\sigma$ in a word $w$, and let $k'_\sigma$ count disjoint copies of $\sigma$ in $w$ by
using the greedy algorithm: that is, there is equality $k'_\sigma(w) = k'_\sigma(v) + 1$ where $v$
is the word obtained from $w$ by deleting the prefix up to and including the first
occurrence of $\sigma$.

Lemma 3.18 (Greedy is good). The functions $k_\sigma$ and $k'_\sigma$ are equal.

Proof. Let $w$ be a shortest word for which $k_\sigma(w)$ and $k'_\sigma(w)$ are not equal. By
definition, $k'_\sigma(w) \le k_\sigma(w)$ and since $w$ is shortest, $k'_\sigma(w) = k_\sigma(w) - 1$. Since $w$ is
shortest, the suffix of $w$ must be a copy of $\sigma$ which is counted by $k_\sigma$ but not by
$k'_\sigma$. Hence the greedy algorithm must count a copy of $\sigma$ which overlaps with this
copy. So deleting the terminal copy of $\sigma$ reduces the values of both $k_\sigma$ and $k'_\sigma$ by
1, contrary to the hypothesis that $w$ was shortest. □
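The lemma can be tested exhaustively on short words. The sketch below is illustrative only: `greedy_count` follows the recursive description of the greedy algorithm, while `max_disjoint_count` is an independent dynamic program for the maximal number of disjoint copies. Both function names are hypothetical.

```python
from itertools import product

def greedy_count(sigma, word):
    """Greedy count: take the first occurrence of sigma, delete the prefix
    up to and including it, and repeat on what is left."""
    count = 0
    pos = word.find(sigma)
    while pos != -1:
        count += 1
        word = word[pos + len(sigma):]
        pos = word.find(sigma)
    return count

def max_disjoint_count(sigma, word):
    """Maximal number of disjoint copies, by dynamic programming over prefixes."""
    n, m = len(word), len(sigma)
    best = [0] * (n + 1)              # best[i] = optimum for the prefix word[:i]
    for i in range(1, n + 1):
        best[i] = best[i - 1]         # option 1: no copy ends at position i
        if i >= m and word[i - m:i] == sigma:
            best[i] = max(best[i], best[i - m] + 1)   # option 2: a copy ends at i
    return best[n]

def agree_up_to(sigma, alphabet, max_len):
    """Check that the two counts agree on every word of length <= max_len."""
    return all(
        greedy_count(sigma, "".join(w)) == max_disjoint_count(sigma, "".join(w))
        for n in range(max_len + 1)
        for w in product(alphabet, repeat=n)
    )
```

For instance, `agree_up_to("aba", "ab", 10)` compares the two counts on every binary word of length at most 10, including words in which copies of `aba` overlap.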

3.6. Hyperbolic groups. The purpose of this section is to prove that the
Epstein-Fujiwara counting quasimorphisms are bicombable. The proof has a number of
structural similarities with the proof of Lemma 3.8. Note that it suffices to show
that they are weakly combable, since any quasimorphism is Lipschitz in both the
$d_L$ and the $d_R$ metrics.

Theorem 3.19. Counting quasimorphisms on hyperbolic groups are bicombable. 

Proof. Fix a hyperbolic group $G$ and a symmetric generating set $S$. Let $L$ be
a combing for $G$. Remember that this means that $L$ is a prefix-closed regular
language of geodesics in $G$ (with respect to the fixed generating set $S$) for which
the evaluation map is a bijection $L \to G$.

Let $\sigma$ be a string, and $\phi_\sigma$ the associated counting quasimorphism. It suffices to
show that $c_\sigma$ is weakly combable. For the sake of convenience, abbreviate $c_\sigma$ to $c$
throughout this proof.

Fix a word $w \in L$, and let $\gamma$ be the corresponding path in $C_S(G)$. By Lemma 3.17
a realizing path $\gamma'$ for $w$ is a $K, \epsilon$ quasigeodesic, and therefore by the Morse Lemma,
it satisfies the following two conditions (compare with the proof of Lemma 3.8):

(1) there is a constant $C_1$ so that every $\gamma(i)$ is within distance $C_1$ of some $\gamma'(j)$
and every $\gamma'(j)$ is within distance $C_1$ of some $\gamma(i)$

(2) there is a constant $C_2$ so that if $\gamma'(i)$ is within distance $C_1$ of $\gamma(j)$, then
$\gamma'(k)$ is within distance $C_1$ of $\gamma(j + 1)$ for some $k$ with $|k - i| \le C_2$

Here distances are all measured in $C_S(G)$. Similarly, if $\alpha$ is a geodesic path, say
that a $K, \epsilon$ quasigeodesic path $\alpha'$ is good for $\alpha$ if they both start at id, if their
endpoints are at most distance $C_1$ apart, and if they satisfy properties (1) and (2)
above (with $\alpha$ and $\alpha'$ in place of $\gamma$ and $\gamma'$).

Let $T$ denote the set of proper prefixes of $\sigma$. Given a (directed) path $\alpha$ in $C_S(G)$,
first find all disjoint copies of $\sigma$ in $\alpha$ obtained by the greedy algorithm, and let
$p(\alpha) \in T$ be the biggest prefix of $\sigma$, disjoint from these copies, that is a "suffix" of
(the word corresponding to) $\alpha$.

Define a function $\mu_i$ as follows. Let $B$ denote the (set of vertices in the) ball of
radius $C_1$ in $C_S(G)$ about the identity. For each $i$, let $\mu_i$ be the function whose
domain is $B \times T$, and for each $b \in B$ and $t \in T$ define $\mu_i(b, t)$ as follows:

(1) if there is no good path $\alpha$ (for $\gamma_i$) with endpoint $w_i b$ and $p(\alpha) = t$, set
$\mu_i(b, t)$ equal to an "out of range" symbol $E$; otherwise:

(2) set $\mu_i(b, t)$ equal to the tuple whose first term is $|w_i b| - |w_i|$ (i.e. relative distance
to id), and whose second term is the difference $\nu_i(w_i b, t) - c_\sigma(w_i)$, where
$\nu_i(w_i b, t)$ is the maximum of $c_\sigma$ on all good paths $\alpha$ as above with $p(\alpha) = t$

A good path ending at $w_i b$ can be concatenated with a short path ending at $w_i$
and conversely, so there is a constant $C_3$ so that $\mu_i$ takes values in
$E \cup [-C_1, C_1] \times [-C_3, C_3]$, a finite set. Every good path ending at some $w_{i+1} b'$ is
obtained by concatenating a good path ending at $w_i b$ with a path of length at most $C_2$, and
these can all be enumerated. The relative distance to id on balls of uniform size
can be determined as one moves along the geodesic $\gamma$, by keeping track of only a
bounded amount of data at each stage (this is the heart of Cannon's argument that
hyperbolic groups are combable; for details see [5]). As in the proof of Lemma 3.8,
the function $\mu_i$ on $B \times T$ and the letter $w(i + 1)$ together determine the function
$\mu_{i+1}$, and the difference $c_\sigma(w_{i+1}) - c_\sigma(w_i)$. From this it readily follows that $c_\sigma$ is
weakly combable, and (since it is evidently bilipschitz) therefore bicombable. The
same is true of $c_{\sigma^{-1}}$, and therefore of their difference $\phi_\sigma$. □

4. Ergodic theory of bicombable functions

Let $G$ be a hyperbolic group with finite generating set $S$. Let $L$ be a combing
with respect to $S$, and $\Gamma$ a digraph parameterizing $L$. Graphs $\Gamma$ of this kind are
not arbitrary, but are in fact quite constrained. The ergodic theory of $\Gamma$ can be
understood using the basic machinery of Markov chains. This material is very well
understood, but we keep the discussion self-contained and elementary as far as
possible. Basic references for this material include [2], [H], P~5]> [E] and [20], and
at a couple of points we refer the reader to these texts for proofs of standard facts.

4.1. Almost semisimple directed graphs. Let $\Gamma$ be a finite directed graph. Let
$V$ be the real vector space spanned by the vertices of $\Gamma$, and let $\langle \cdot, \cdot \rangle$ be the inner
product on $V$ for which the vertices are an orthonormal basis. We could and do
think of $V$ as the space of real-valued functions on the vertices of $\Gamma$. Let $V \otimes \mathbb{C}$
denote the corresponding vector space of complex-valued functions on the vertices
of $\Gamma$. Let $v_1$ denote the initial vertex, and enumerate the other vertices somehow
as $v_2, v_3, \ldots$. For a vector $v \in V$, let $|v|$ denote the $L^1$ norm of $v$. That is,

$$|v| = \sum_i |\langle v, v_i \rangle|$$

Definition 4.1. The transition matrix of $\Gamma$ (also sometimes called the adjacency
matrix) is the $n \times n$ matrix $M$ whose $M_{ij}$ entry counts the number of directed
edges from $v_i$ to $v_j$ (or is equal to 0 if there are no such edges).

Lemma 4.2. For any $v_i, v_j$ the number of directed paths in $\Gamma$ of length $n$ from $v_i$
to $v_j$ is $\langle v_i, M^n v_j \rangle = (M^n)_{ij}$.

Proof. This is true for $n = 1$. Suppose by induction it is true for $n = m$. The
number of paths of length $m + 1$ from $v_i$ to $v_j$ is equal to the sum over all $v_k$ of the
number of paths of length $m$ from $v_i$ to $v_k$ times the number of paths of length 1
from $v_k$ to $v_j$. By induction, this is $\sum_k (M^m)_{ik} M_{kj} = (M^{m+1})_{ij}$. □
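The path-counting identity is easy to confirm on a small example. The sketch below is illustrative only: the matrix `M` is a hypothetical digraph (with a double edge, and with vertex 0 playing the role of the initial vertex), and `count_paths` enumerates paths directly instead of using matrix powers.

```python
from itertools import product

def mat_mul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(M, n):
    """n-th power of M by repeated multiplication (n >= 0)."""
    size = len(M)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, M)
    return result

def count_paths(M, i, j, n):
    """Count directed paths of length n >= 1 from vertex i to vertex j by listing
    every sequence of intermediate vertices and multiplying edge multiplicities."""
    total = 0
    for middle in product(range(len(M)), repeat=n - 1):
        vertices = (i,) + middle + (j,)
        ways = 1
        for a, b in zip(vertices, vertices[1:]):
            ways *= M[a][b]
        total += ways
    return total

# Hypothetical example digraph: vertex 0 is initial and has no incoming edges.
M = [[0, 1, 1],
     [0, 1, 2],
     [0, 1, 0]]
```

Comparing `count_paths(M, i, j, n)` with `mat_pow(M, n)[i][j]` for small `n` reproduces the induction in the proof one step at a time.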

By Lemma 4.2 the number of paths in $\Gamma$ of length $n$ starting at $v_1$ is $|(M^T)^n v_1|$,
where $M^T$ denotes the transpose of $M$. One could equally well adopt the notation
$|v_1 M^n| := |(M^T)^n v_1|$ but we prefer the more explicit notation (involving the
transpose) since there are many opportunities for left-right errors in what follows.

Definition 4.3. A directed graph $\Gamma$ is almost semisimple if it satisfies the following
properties.

(1) There is an initial vertex $v_1$

(2) For every $i \ne 1$ there is a directed path in $\Gamma$ from $v_1$ to $v_i$

(3) There are constants $\lambda > 1$, $K > 1$ so that

$$K^{-1} \lambda^n \le |(M^T)^n v_1| \le K \lambda^n$$

for all positive integers $n$

In what follows we will assume that $\Gamma$ is almost semisimple.

Lemma 4.4. Suppose $\Gamma$ is almost semisimple. Then $\lambda$ is the largest real eigenvalue
of $M$. Moreover, for every eigenvalue $\xi$ of $M$ either $|\xi| < \lambda$ or else the geometric
and the algebraic multiplicities of $\xi$ are equal.

Proof. It is convenient to work with $M^T$ in place of $M$. To prove the lemma, it
suffices to prove analogous facts about the matrix $M^T$. Corresponding to the Jordan
decomposition of $M^T$ over $\mathbb{C}$, let $\xi_1, \ldots, \xi_m$ be the eigenvalues of the corresponding
Jordan blocks (listed with multiplicity). We consider $V \otimes \mathbb{C}$ with the hermitian
inner product.

Bullet (2) from Definition 4.3 implies that for any $v_i$, there is an inequality
$|(M^T)^n v_i| \le C_i |(M^T)^n v_1|$ for some constant $C_i$. Since the $v_i$ span $V$, and since $V$
is finite dimensional, there is a constant $C$ such that for all $w \in V$ with $|w| = 1$
there is an inequality $|(M^T)^n w| \le C |(M^T)^n v_1|$. This is true if $w \in V \otimes \mathbb{C}$ as well.

For each $i$, there is some $w_i \in V \otimes \mathbb{C}$ in the $\xi_i$-eigenspace for which

$$|(M^T)^n w_i| \ge \mathrm{constant} \cdot n^{k-1} |\xi_i|^n$$

where $k$ is the dimension of the Jordan block associated to $\xi_i$. Since
$|(M^T)^n w_i| \le C |(M^T)^n v_1|$, by bullet (3) from Definition 4.3 either $|\xi_i| < \lambda$ or
$|\xi_i| = \lambda$ and $k = 1$.

By the Perron-Frobenius theorem for non-negative matrices (see e.g. [T7], § 3.2
pp. 23-26), $M$ has a largest real eigenvalue $\lambda'$ such that $|\xi| \le \lambda'$ for all eigenvalues
$\xi$. We must have $\lambda' = \lambda$ by the estimates above. □

Note that $M$ and $M^T$ have the same set of eigenvalues, with the same multiplicities.
Since every eigenvalue of $M$ with largest (absolute) value has geometric
multiplicity equal to its algebraic multiplicity, the same is true of $M^T$.

For any vector $v \in V$, decompose $v = \sum_\xi v(\xi)$ into the components in the
generalized eigenspaces of the eigenvalues $\xi$. Note that $(Mv)(\xi) = M(v(\xi))$, so we
write it as $Mv(\xi)$. Since any two norms on $V \otimes \mathbb{C}$ are equivalent, there is a constant
$K > 1$ such that

$$K^{-1} \le \frac{|M^n v|}{\sum_\xi |M^n v(\xi)|} \le K$$

providing $M^n v \ne 0$.

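Lemma 4.5. For any vector $v \in V$, the following limit

$$\rho(v) := \lim_{n \to \infty} n^{-1} \sum_{i=0}^{n} \lambda^{-i} M^i v$$

exists and is equal to $v(\lambda) \in V$.

Proof. We suppress $v$ in the notation that follows. For each eigenvalue $\xi$ define

$$\rho_n(\xi) := n^{-1} \sum_{i=0}^{n} \lambda^{-i} M^i v(\xi)$$

and set $\rho_n = \sum_\xi \rho_n(\xi)$. With this notation, $\rho = \lim_{n \to \infty} \rho_n$, and we want to show
that this limit exists, and is equal to $v(\lambda)$.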

By Lemma 4.4 for each $\xi$, either $|\xi| < \lambda$ or $v(\xi)$ is a $\xi$-eigenvector. In the first
case, $\rho_n(\xi) \to 0$. In the second case, either $\xi = \lambda$, or else the vectors $\lambda^{-i} M^i v(\xi)$
become equidistributed in the unit circle in the complex line of $V \otimes \mathbb{C}$ spanned by
$v(\xi)$. It follows that $\rho_n(\xi) \to 0$ unless $\xi = \lambda$. So
$n^{-1} \sum_{0 \le i \le n} \lambda^{-i} M^i v(\xi) \to 0$ unless
$\xi = \lambda$, in which case $\rho_n(\lambda) = v(\lambda)$ is constant. □
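A numerical sketch may help to see why the averaging is needed. The example below is an assumption chosen for illustration: the matrix with rows $(0, 2)$ and $(2, 0)$ has eigenvalues $2$ and $-2$, so the rescaled iterates oscillate and do not converge, while their averages approach the component of $v$ in the eigenspace for the eigenvalue $2$.

```python
def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

def cesaro_average(M, lam, v, n):
    """Return n^{-1} * sum_{i=0}^{n} lam^{-i} M^i v as a list of floats."""
    total = [0.0] * len(v)
    term = [float(x) for x in v]          # the i = 0 term, lam^0 M^0 v
    for _ in range(n + 1):
        total = [t + x for t, x in zip(total, term)]
        term = [x / lam for x in apply(M, term)]   # next term lam^{-(i+1)} M^{i+1} v
    return [t / n for t in total]

# Hypothetical example: eigenvalues 2 and -2, with eigenvectors (1, 1) and (1, -1).
M = [[0, 2],
     [2, 0]]
average = cesaro_average(M, 2.0, [1, 0], 2000)
```

Here $v = (1, 0) = \frac{1}{2}(1, 1) + \frac{1}{2}(1, -1)$, so the limit predicted by the lemma is $(\frac{1}{2}, \frac{1}{2})$, and `average` is within about $1/n$ of that vector.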

The same argument as Lemma 4.5 implies

Lemma 4.6. For any vector $v \in V$, the following limit

$$\ell(v) := \lim_{n \to \infty} n^{-1} \sum_{i=0}^{n} \lambda^{-i} (M^T)^i v$$

exists, and is equal to the projection of $v$ onto the left $\lambda$-eigenspace of $M$.

For any $v_i$, the partial sums $\rho_n(v_i)$ are non-negative real vectors, so if $v$ is
non-negative and real, so are $\rho(v)$ and $\ell(v)$.

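Lemma 4.7. For any $v, w \in V$, there is an identity

$$\langle \ell(v), w \rangle = \langle \ell(v), \rho(w) \rangle = \langle v, \rho(w) \rangle$$

Proof. Let $\pi$ denote projection onto the (right) $\lambda$-eigenspace, so $\pi v = \rho(v)$ and
$\ell(v) = \pi^T v$. Since $\pi$ is a projection, it is idempotent; i.e. $\pi \circ \pi = \pi$, and similarly
for $\pi^T$. Hence

$$\langle \pi^T v, w \rangle = \langle \pi^T \pi^T v, w \rangle = \langle \pi^T v, \pi w \rangle = \langle v, \pi \pi w \rangle = \langle v, \pi w \rangle$$

□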

4.2. Stationary Markov chains. We would like to be able to discuss the typical
behavior of an infinite (or sufficiently long) path in $\Gamma$. The standard way to do this
is to use the machinery of Markov chains. A stationary Markov chain is a random
process $X_0, X_1, X_2, \ldots$ where each $X_i$ is in one of a finite set of states $V$. The process
starts in one state, and successively moves from one state to another. If the chain
is in some state $v_i \in V$ then the probability that it moves to state $v_j \in V$ at the
next step is some number $N_{ij}$ which depends only on $i$ and $j$; in other words, given
the present state, the future states are independent of the past states. The starting
state is determined by some initial probability distribution $\mu$. See [11] Chapter 11
for an introduction to the theory of Markov chains.

Our digraph $\Gamma$ determines a stationary Markov chain as follows. For each $n$, let
$X_n$ denote the set of paths in $\Gamma$ of length $n$, starting at an arbitrary vertex, and $Y_n$
the set of paths of length $n$ starting at the initial vertex. A path of length 0 is a
vertex. Restricting to a prefix defines surjections $X_{n+1} \to X_n$ and $Y_{n+1} \to Y_n$ for
all $n$. Giving each $X_n$, $Y_n$ the discrete topology, we obtain inverse limits $X$ and $Y$
which denote the space of infinite paths starting at an arbitrary vertex, and infinite
paths starting at the initial vertex respectively.

Topologically, $X$ and $Y$ are Cantor sets. The projections $X \to X_n$ and $Y \to Y_n$
pull back points to open subsets called cylinders, which generate the Borel
$\sigma$-algebras of $X$ and $Y$.

Definition 4.8. The shift map $S : X \to X$ takes an infinite path to its suffix which
is the complement of the initial vertex.

It is bothersome to use the same letter S both for the shift map and for a 
generating set for G, but (unfortunately) both notations are standard, and in any 
case the meaning should always be clear from context. 

Let $1$ denote the constant function on $\Gamma$ taking the value 1 on every vertex.
Define a matrix $N$ by

$$N_{ij} = \frac{M_{ij}\, \rho(1)_j}{\lambda\, \rho(1)_i}$$

if $\rho(1)_i \ne 0$, and otherwise define $N_{ii} = 1$ and $N_{ij} = 0$ for $i \ne j$. Define a measure
$\mu$ on the vertices of $\Gamma$ by

$$\mu_i = \rho(1)_i\, \ell(v_1)_i$$

where subscripts denote vector and matrix components.

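Lemma 4.9. The matrix $N$ is a stochastic matrix (i.e. it is non-negative, and for
all $i$ it satisfies $\sum_j N_{ij} = 1$) and preserves the measure $\mu$.

Proof. For any $i$ not in the support of $\rho(1)$ we have $\sum_j N_{ij} = 1$ by fiat. Otherwise
we have

$$\sum_j N_{ij} = \sum_j \frac{M_{ij}\, \rho(1)_j}{\lambda\, \rho(1)_i} = \frac{(M\rho(1))_i}{\lambda\, \rho(1)_i} = 1$$

since $\rho(1)$ is a $\lambda$-eigenvector of $M$. This shows $N$ is a stochastic matrix. To see that
it preserves $\mu$, we calculate

$$\sum_i \mu_i N_{ij} = \sum_i \rho(1)_i\, \ell(v_1)_i\, \frac{M_{ij}\, \rho(1)_j}{\lambda\, \rho(1)_i} = \frac{\rho(1)_j\, (M^T \ell(v_1))_j}{\lambda} = \rho(1)_j\, \ell(v_1)_j = \mu_j$$

where the sum is over $i$ in the support of $\rho(1)$, and $\ell(v_1)$ is a $\lambda$-eigenvector of $M^T$.
Hence $N$ preserves $\mu$. □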
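The construction can be carried out numerically on a toy graph. The following sketch makes several assumptions for illustration: the digraph `M` is a hypothetical example (an initial vertex feeding a full 2-shift, so the growth rate is 2), the two eigenvector projections are approximated by finite averages rather than computed exactly, and the function names are not from the text.

```python
def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def cesaro_limit(M, lam, v, n=5000):
    """Approximate lim n^{-1} sum_{i<=n} lam^{-i} M^i v by a finite average."""
    total = [0.0] * len(v)
    term = [float(x) for x in v]
    for _ in range(n + 1):
        total = [t + x for t, x in zip(total, term)]
        term = [x / lam for x in apply(M, term)]
    return [t / n for t in total]

def markov_data(M, lam, eps=1e-9):
    """Return (N, mu); vertex 0 plays the role of the initial vertex."""
    size = len(M)
    rho = cesaro_limit(M, lam, [1] * size)                          # right projection of 1
    ell = cesaro_limit(transpose(M), lam, [1] + [0] * (size - 1))   # left projection of v_1
    N = [[0.0] * size for _ in range(size)]
    for i in range(size):
        # Crude zero test; adequate here because rho is positive in the example.
        if rho[i] > eps:
            for j in range(size):
                N[i][j] = M[i][j] * rho[j] / (lam * rho[i])
        else:
            N[i][i] = 1.0        # rows outside the support are set by fiat
    mu = [rho[i] * ell[i] for i in range(size)]
    scale = sum(mu)
    return N, [m / scale for m in mu]

# Hypothetical example: initial vertex 0 feeding a full 2-shift on vertices 1, 2.
M = [[0, 1, 1],
     [0, 1, 1],
     [0, 1, 1]]
N, mu = markov_data(M, lam=2.0)
```

In this example every row of `N` is $(0, \frac{1}{2}, \frac{1}{2})$ and `mu` is close to $(0, \frac{1}{2}, \frac{1}{2})$, so the initial vertex carries essentially no stationary mass, in line with the role of recurrent components in what follows.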

We scale $\mu$ to be a probability measure. By abuse of notation, we also denote
this probability measure by $\mu$. This probability measure is preserved by $N$.

There is an associated probability measure on each $X_n$, which we also denote by
$\mu$, where

$$\mu(v_{i_0} v_{i_1} \cdots v_{i_n}) = \mu(v_{i_0})\, N_{i_0 i_1} N_{i_1 i_2} \cdots N_{i_{n-1} i_n}$$

This defines a measure on each cylinder (i.e. preimage of an element of $X_n$ under
$X \to X_n$) and therefore a probability measure $\mu$ on $X$. Lemma 4.9 implies that $\mu$
on $X$ is invariant under the shift map $S$; i.e. $\mu(A) = \mu(S^{-1}(A))$ for all measurable
$A \subset X$. We call $\mu$ the stationary measure.

Lemma 4.10. Suppose $\Gamma$ is almost semisimple, and $\mu$ is the stationary measure
on $\Gamma$ for $N$. The support of $\mu$ is equal to the union of the maximal recurrent
components of $\Gamma$ whose adjacency matrix has biggest eigenvalue $\lambda$. Furthermore, if
$C, C'$ are distinct components in the support of $\mu$, there is no directed path from $C$
to $C'$.

Proof. Recall that the Landau notation $f(x) = \Theta(g(x))$ for non-negative functions
$f, g$ of $x$ means that there are positive constants $k_1, k_2$ so that

$$k_1 g(x) \le f(x) \le k_2 g(x)$$

and the Landau notation $f(x) = o(g(x))$ means that $f(x)/g(x) \to 0$ as $x \to \infty$.

Since $\mu_i = \rho(1)_i\, \ell(v_1)_i$, a vertex $v_j$ is in the support of $\mu$ if and only if the number
of paths from $v_1$ to $v_j$ of length in $[n - k, n + k]$ is $\Theta(\lambda^n)$ for some sufficiently big
constant $k$, and the number of outgoing paths from $v_j$ of length in $[n - k, n + k]$ is
also $\Theta(\lambda^n)$. Associated to a recurrent component $C$ of $\Gamma$ one can form the adjacency
matrix of the subgraph $C$, which has a biggest real eigenvalue $\xi(C)$, by the
Perron-Frobenius Theorem. Suppose for every directed path $\gamma$ from $v_1$ to $v_j$ and for every
recurrent component $C$ which intersects $\gamma$ we have $\xi(C) < \lambda$. Then it is easy to see
that the number of paths from $v_1$ to $v_j$ of length in $[n - k, n + k]$ is $o(\lambda^n)$. Similarly,
if for every directed path $\gamma$ starting at $v_j$ and every recurrent component $C$ which
intersects $\gamma$ we have $\xi(C) < \lambda$ then the number of outgoing paths from $v_j$ of length
in $[n - k, n + k]$ is $o(\lambda^n)$. Hence $v_j$ is in the support of $\mu$ if and only if there are
recurrent components $C, C'$ with $\xi(C) = \xi(C') = \lambda$ such that there is a path from
$v_1$ to $v_j$ intersecting $C$, and an outgoing path from $v_j$ intersecting $C'$. Note that
$v_j \in C = C'$ is explicitly allowed here. If $C \ne C'$ but there is a path from $C$ to
$C'$, then the number of paths of length $n$ which start in $C$ and end in $C'$ has order
$\Theta(\sum_i \lambda^i \lambda^{n-i}) = \Theta(n\lambda^n)$, contrary to the hypothesis that $\Gamma$ is almost semisimple.
It follows necessarily that $C = C'$. □

The construction of the measure $\mu$ and the stochastic matrix $N$ might seem
unmotivated for the moment. In the next section we shall provide a more geometric
interpretation of these objects.

4.3. Word-hyperbolic groups and Patterson-Sullivan measures. We now
relate this picture to the theory of hyperbolic groups. Let $G$ be a non-elementary
hyperbolic group with generating set $S$.

Definition 4.11. The Poincaré series of $G$ is the series

$$\zeta_G(s) = \sum_{g \in G} e^{-s|g|}$$

This series diverges for all sufficiently small $s$, and converges for all sufficiently
large $s$. The critical exponent is the supremum of the values of $s$ for which the
series diverges.

Theorem 4.12 (Coornaert, [6] Thm. 7.2). Let $G$ be a non-elementary hyperbolic
group with generating set $S$. Let $G_n$ be the set of elements of word length $n$. Then
there are constants $\lambda > 1$, $K > 1$ so that

$$K^{-1} \lambda^n \le |G_n| \le K \lambda^n$$

for all positive integers $n$.

It follows from Theorem 4.12 that the critical exponent of the Poincaré series is
equal to $\log(\lambda)$, and the series $\zeta_G(\log(\lambda))$ diverges.

Example 4.13. Not every group with a geodesic combing satisfies such an inequality.
For example, let $G = F_2 \times F_2$ with the standard generating set. Then we can
estimate

$$K^{-1} n 3^n \le |G_n| \le K n 3^n$$

for suitable $K > 0$. On the other hand, the language of words $uv$ where $u$ is a
reduced word in the first $F_2$ factor and $v$ is a reduced word in the second factor
defines a geodesic combing.
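The growth estimate in this example can be checked by direct counting. The sketch below assumes that the standard generating set is the union of the free generators of the two factors, so that the word length of a pair $(u, v)$ is $|u| + |v|$; the function names are hypothetical.

```python
def free_sphere(k):
    """Number of reduced words of length k in the free group F_2."""
    return 1 if k == 0 else 4 * 3 ** (k - 1)

def product_sphere(n):
    """Number of elements of word length n in F_2 x F_2, assuming the length
    of a pair (u, v) is |u| + |v|."""
    return sum(free_sphere(i) * free_sphere(n - i) for i in range(n + 1))

def growth_ratio(n):
    """Ratio of the sphere size to n * 3^n, which stays bounded for n >= 1."""
    return product_sphere(n) / (n * 3 ** n)
```

The values of `growth_ratio(n)` stay between fixed positive bounds, whereas `product_sphere(n) / 3 ** n` grows linearly in `n`, so no inequality of the form in Theorem 4.12 can hold with $\lambda = 3$.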

For each $n$, let $\nu_n$ be the probability measure on $G$ defined by

$$\nu_n = \frac{\sum_{|g| \le n} \lambda^{-|g|}\, \delta_g}{\sum_{|g| \le n} \lambda^{-|g|}}$$

where $\delta_g$ is the Dirac measure on the element $g$. The measure $\nu_n$ extends trivially
to a probability measure on $G \cup \partial G$, where $\partial G$ denotes the Gromov boundary of
$G$.

Definition 4.14. A weak limit $\nu := \lim_{n \to \infty} \nu_n$ is a Patterson-Sullivan measure
associated to $S$.

In fact, the limit in Definition 4.14 exists, and an explicit formula for a closely
related measure $\hat\nu$ (which differs from $\nu$ only by scaling) will be given shortly. Note
that the support of $\nu$ is contained in $\partial G$, since the Poincaré series diverges at the
critical exponent $\log(\lambda)$.

Let $L$ be a combing of $G$ with respect to $S$, and let $\Gamma$ be a digraph parameterizing
$L$. From Coornaert's Theorem we immediately deduce

Lemma 4.15. Let $\Gamma$ be as above. Then $\Gamma$ is almost semisimple.

A slightly better normalization of the measures $\nu_n$ is the sequence of measures
$\hat\nu_n$ defined by the formula

$$\hat\nu_n = \frac{1}{n} \sum_{|g| \le n} \lambda^{-|g|}\, \delta_g$$

and let $\hat\nu$ be a weak limit (supported on $\partial G$).

Lemma 4.16. Any two weak limits $\nu, \hat\nu$ on $\partial G$ as above are absolutely continuous
with respect to each other, and satisfy $1/K \le d\hat\nu/d\nu \le K$ for some $K > 1$.

Proof. The lemma is true for each $\nu_n, \hat\nu_n$ by Theorem 4.12 and therefore it is also
true for their limits. □

The measure $\hat\nu$ is not a probability measure in general, though it is finite.

The group $G$ acts on itself by left multiplication. This action extends continuously
to a left action $G \times \partial G \to \partial G$. Patterson-Sullivan measures enjoy a number
of useful properties, summarized in the following theorem.

Theorem 4.17 (Coornaert, [6] Thm. 7.7). Let $\nu$ be a Patterson-Sullivan measure.
The action of $G$ on $\partial G$ preserves the measure class of $\nu$. Moreover, the action of
$G$ on $\partial G$ is ergodic with respect to $\nu$.

Here we say that a group action which preserves a measure class is ergodic if for
all subsets $A, B \subset \partial G$ with positive $\nu$-measure, there is $g \in G$ with $\nu(gA \cap B) > 0$.
By Lemma 4.16 the action of $G$ on $\partial G$ preserves the measure class of $\hat\nu$ and acts
ergodically.

Remark 4.18. In fact, Coornaert shows that there is some constant $K > 1$ such
that for any $s \in S$ there is an inequality

$$K^{-1} \le \frac{d(s_* \nu)}{d\nu} \le K$$

i.e. the action is "quasiconformal". It follows that the same is true for the $\hat\nu$
measure, with the same constant $K$. We will not use this fact in this paper.

We now relate the measure $\hat\nu$ to the measure $\mu$ and the stochastic matrix $N$
defined in § 4.2.

There is a natural map $Y_n \to G$ for each $n$, taking a path in $\Gamma$ to the endpoint of
the corresponding geodesic path in $G$ which starts at id. The measures $\hat\nu_n$ determine
a measure on $Y$ as follows.

For each $g \in G$, let $\mathrm{cone}(g)$ denote the set of $g' \in G$ such that the geodesic for $g'$
in the combing passes through $g$. Let $p : Y \to Y_n$ take an infinite path to its prefix
of length $n$. If $y \in Y_n$ corresponds to $g \in G$, define

$$\hat\nu(p^{-1}(y)) = \lim_{m \to \infty} \hat\nu_m(\mathrm{cone}(g))$$

Since the cylinders $p^{-1}(y)$ define the Borel $\sigma$-algebra on $Y$, this defines a measure
$\hat\nu$ on $Y$. There is a map $Y \to \partial G$ taking an infinite path to the endpoint of the
corresponding geodesic in $G$. The pushforward of $\hat\nu$ on $Y$ under this map agrees
with $\hat\nu$ on $\partial G$.

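We give an explicit formula for $\hat\nu_m(\mathrm{cone}(g))$ for any $g$. By definition, for any
$g \in G_n$ and any $m \ge n$ we have

$$\hat\nu_m(\mathrm{cone}(g)) = \frac{1}{m} \sum_{h \in \mathrm{cone}(g),\ |h| \le m} \lambda^{-|h|}$$

If $v_g \in \Gamma$ is the vertex corresponding to the endpoint of the path for the element
$y \in Y_n$ corresponding to $g$, then we can rewrite this as

$$\hat\nu_m(\mathrm{cone}(g)) = \frac{1}{m} \lambda^{-n} \sum_{i \le m - n} \lambda^{-i} \langle (M^T)^i v_g, 1 \rangle$$

Taking a limit as $m \to \infty$, we get

$$\hat\nu(p^{-1}(y)) = \lambda^{-n} \langle \ell(v_g), 1 \rangle = \lambda^{-n} \langle v_g, \rho(1) \rangle = \lambda^{-n} \rho(1)_g$$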

Measures are pushed forward by maps, so $S_* \hat\nu$ is the measure on $X$ satisfying

$$S_* \hat\nu(B) = \hat\nu(S^{-1}(B))$$

for any Borel $B \subset X$, and similarly for positive powers $S^i$.

The subset $Y \subset X$ contains the support of $\hat\nu$. By convention, $v_1$ has no incoming
edges, so $Y$ is disjoint from $SY$, and therefore $\hat\nu$ is not invariant under $S$. However,
the limit

$$\lim_{n \to \infty} \sum_{j=1}^{n} \frac{1}{n} S^j_* \hat\nu$$

is manifestly $S$ invariant. The fact that this limit exists is contained in the proof
of the next Lemma.

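Lemma 4.19. The measure $\lim_{n \to \infty} \sum_{j=1}^{n} \frac{1}{n} S^j_* \hat\nu$ on $X$ is equal to the measure $\mu$
on $X$ defined on cylinders using the measure $\mu$ on $\Gamma$ and $N$ as in § 4.2.

Proof. We identify $X_0 = \Gamma$, and observe that the measure $\mu$ on $\Gamma$ agrees with the
measure $\mu$ on cylinders $p^{-1}(v)$ of $X$ for $v \in \Gamma$. For each vertex $v_i$ of $\Gamma$ we can
calculate

$$S^n_* \hat\nu(p^{-1}(v_i)) = \sum_{y \in Y_n,\ S^n y = v_i} \hat\nu(p^{-1}(y)) = \sum_{y \in Y_n,\ S^n y = v_i} \lambda^{-n} \rho(1)_i$$

On the other hand, the number of $y \in Y_n$ with $S^n y = v_i$ is exactly equal to the
number of directed paths in $\Gamma$ of length $n$ which start at $v_1$ and end at $v_i$, which is
$\langle v_1, M^n v_i \rangle$. Hence

$$\lim_{n \to \infty} \sum_{j=1}^{n} \frac{1}{n} S^j_* \hat\nu(p^{-1}(v_i)) = \lim_{n \to \infty} \langle v_i, \rho(1) \rangle \Big\langle v_1, \frac{1}{n} \sum_{j \le n} \lambda^{-j} M^j v_i \Big\rangle$$

$$= \langle v_i, \rho(1) \rangle \langle v_1, \rho(v_i) \rangle = \rho(1)_i\, \ell(v_1)_i = \mu_i$$

This shows that the measures agree on the cylinders $p^{-1}(v_i)$ for $v_i \in \Gamma = X_0$. By
the defining property of $\mu$ on $X$, it suffices to show that for every $y \in Y$ whose $n$th
vertex is $v_i$, the $\hat\nu$ probability that the $(n + 1)$st vertex is $v_j$ is $N_{ij}$. If $y_n$ is a prefix
of $y$ of length $n$ whose last vertex is $v_i$, and $g \in G_n$ is the corresponding element,
we need to calculate

$$\lim_{m \to \infty} \frac{\sum_h \hat\nu_m(\mathrm{cone}(h))}{\hat\nu_m(\mathrm{cone}(g))}$$

where the sum is taken over all $h \in G_{n+1} \cap \mathrm{cone}(g)$ corresponding to $y' \in Y_{n+1}$
whose last vertex is $v_j$. For each such $g$, the number of $h$ is $M_{ij}$ so this ratio is
equal to $M_{ij}\, \rho(1)_j / \lambda \rho(1)_i$ which equals $N_{ij}$ as claimed. □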

In fact we can be even more precise about the relationship between $\hat\nu$ and $\mu$. If
$y \in Y_n$ is a prefix whose last vertex projects to some $v \in \Gamma$ in the support of $\mu$, then
$S^n$ maps the cylinder $p^{-1}(y)$ of $Y$ to the cylinder $X_v$ of $X$ consisting of paths with
initial vertex $v$, and from the definitions, the pushforward of $\hat\nu|_{p^{-1}(y)}$ to $X$ under
$S^n$ is proportional to $\mu|_{X_v}$.

4.4. Central Limit Theorem for stationary Markov chains. The stationary
measure $\mu$ on $\Gamma$ defined in § 4.2 has support equal to a finite union of maximal
recurrent components of $\Gamma$, by Lemma 4.10. Label these components $C^i$, and let
$\mu^i$ be the probability measure on $C^i$ which is proportional to $\mu|_{C^i}$. The matrix $N$
preserves $\mu^i$, so it makes sense to talk about a random walk on $C^i$ with respect to
$N$.

Recall that $d\phi$ exists as a function on the vertices of $\Gamma$. The stationary Markov
chain defined by the matrix $N$ defines a random sequence $x_0, x_1, x_2, \ldots$ of vertices
in $C^i$ where $x_0$ is chosen randomly from the vertices of $C^i$ with respect to the
measure $\mu^i$, and at each subsequent stage if $x_i$ corresponds to a vertex $v_j$, the
probability that $x_{i+1}$ will correspond to the vertex $v_k$ is equal to $N_{jk}$. A sequence
$x_0, x_1, x_2, \ldots$ obtained in this way is called a random walk on $C^i$. If $X^i_n$ denotes the
subset of $X_n$ consisting of paths of length $n$ which start at some vertex in $C^i$, then
the measure $\mu$ on $X_n$ restricted to $X^i_n$ can be scaled to a probability measure $\mu^i$ on
$X^i_n$. In this notation, a random walk on $C^i$ of length $n$ corresponds to a random
element of $X^i_n$.

Let $S^i_n$ be the sum of the values of $d\phi$ on a random walk on $C^i$ of length $n$.
Technically, $S^i_n$ is a distribution on $\mathbb{Z}$. There is a map $X^i_n \to \mathbb{Z}$ taking each walk
$x_0, x_1, x_2, \ldots$ to the sum of $d\phi$ on the corresponding vertices in $\Gamma$. The pushforward
of the measure $\mu^i$ on $X^i_n$ is the distribution $S^i_n$. In this context, the Central
Limit Theorem for stationary Markov chains, due essentially to Markov, takes the
following form:

Theorem 4.20 (Markov). Let $E^i$ be the integral of $d\phi$ with respect to the probability
measure $\mu^i$. Then there is some $\sigma^i \ge 0$ for which

$$\lim_{n \to \infty} P\left( r \le \frac{S^i_n - nE^i}{\sqrt{n(\sigma^i)^2}} \le s \right) = \frac{1}{\sqrt{2\pi}} \int_r^s e^{-x^2/2}\, dx$$

where $P$ denotes probability with respect to the measure $\mu^i$.

Said another way, there is a convergence in the sense of distribution

$$n^{-1/2}(S^i_n - nE^i) \to N(0, \sigma^i)$$

where $N(0, \sigma^i)$ denotes a normal (i.e. Gaussian) distribution with mean 0 and
standard deviation $\sigma^i$. The standard deviation $\sigma^i$ can be computed easily from $\mu$
and $N$, and is an algebraic function of the entries. Hence in particular, $E^i$ and $\sigma^i$
are algebraic and effectively computable.

For a proof, see e.g. [20], p. 231, and see Chap. 4, § 47 for a derivation of a
number of formulae for $\sigma^i$.
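For a small chain the law of the sum can be computed exactly, which makes the mean and variance visible without simulation. The sketch below is an illustration under stated assumptions: the two-state matrix `N`, its stationary vector `mu`, and the integer-valued function `f` (standing in for $d\phi$) are hypothetical, and the sum is taken over the first $n$ states of the walk.

```python
def sum_distribution(N, mu, f, n):
    """Exact law of f(x_0) + ... + f(x_{n-1}) for the chain started from mu."""
    # dist[(state, total)] = probability of being in `state` with running sum `total`
    dist = {}
    for s, p in enumerate(mu):
        if p > 0:
            dist[(s, f[s])] = dist.get((s, f[s]), 0.0) + p
    for _ in range(n - 1):
        nxt = {}
        for (s, total), p in dist.items():
            for t, q in enumerate(N[s]):
                if q > 0:
                    key = (t, total + f[t])
                    nxt[key] = nxt.get(key, 0.0) + p * q
        dist = nxt
    law = {}
    for (_, total), p in dist.items():
        law[total] = law.get(total, 0.0) + p
    return law

def mean_and_variance(law):
    """Mean and variance of a finitely supported distribution {value: probability}."""
    mean = sum(x * p for x, p in law.items())
    var = sum((x - mean) ** 2 * p for x, p in law.items())
    return mean, var

# Hypothetical two-state chain with stationary vector mu and test function f.
N = [[0.9, 0.1],
     [0.2, 0.8]]
mu = [2 / 3, 1 / 3]
f = [1, -1]
mean, var = mean_and_variance(sum_distribution(N, mu, f, 400))
```

Because the chain starts from its stationary vector, the mean of the sum is exactly $n$ times the integral of `f` against `mu`, and `var / 400` is already close to its limiting value, which plays the role of $(\sigma^i)^2$.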

4.5. Ergodicity at infinity and CLT for bicombable functions. To derive a
central limit theorem for $\phi$, we need to be able to compare means $E^i$ and standard
deviations $\sigma^i$ associated to different components $C^i$. To do this we introduce the
idea of a typical path (with respect to the function $\phi$).

For any real number $r$, let $\delta(r)$ denote the probability measure on $\mathbb{R}$ consisting
of an atom of mass 1 concentrated at $r$. Let $\gamma$ be an infinite geodesic ray in $G$. For
some fixed real number $E$ and for integers $n, m$ we can define the distribution

$$\omega(n, m)(\gamma) := \sum_{i=1}^{m} \frac{1}{m}\, \delta\left( (\phi(\gamma_{i+n}) - \phi(\gamma_i) - nE)\, n^{-1/2} \right)$$

Definition 4.21. A geodesic ray $\gamma$ is $E, \sigma$-typical if the limit

$$\omega(\gamma) := \lim_{n \to \infty} \lim_{m \to \infty} \omega(n, m)(\gamma)$$

exists in the sense of distribution, and is equal to $N(0, \sigma)$. An element $y \in Y$ is
$E, \sigma$-typical if the corresponding geodesic ray in $G$ is $E, \sigma$-typical.

We parse this condition as follows. For each segment of $\gamma$ of length $n$ (contained
between $\gamma(0)$ and $\gamma(n + m)$), we look at the difference of $\phi$ on the endpoints. In this
way we obtain a set of $m$ numbers, which we think of as a probability measure
on $\mathbb{R}$, by weighting each number equally. We then translate the origin by $nE$,
and rescale the $x$-axis by a factor of $n^{-1/2}$, and consider the resulting probability
measure $\omega(n, m)(\gamma)$. For fixed $n$, we consider the limit $\omega(n)(\gamma)$ (if it exists); this
distribution is the (rescaled) difference of $\phi$ on the endpoints of a "random" segment
of $\gamma$ of length $n$. Finally, we take the limit as $n \to \infty$ (the rescaling is done so that
this limit typically exists). The result $\omega(\gamma)$ is the difference of $\phi$ on the endpoints
of a "long random" segment of $\gamma$, translated and scaled.
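The finite-stage measure in this construction is simple to compute from a table of values. The sketch below is illustrative only: `phi_values[i]` is assumed to hold the value of the function at the $i$th vertex of the ray, and the function returns the list of atoms, each of which carries mass $1/m$.

```python
def omega_atoms(phi_values, n, m, E):
    """Atoms of the empirical measure for window length n and m windows:
    the rescaled increments over the segments from i to i + n, for i = 1..m."""
    if len(phi_values) <= n + m:
        raise ValueError("need values at vertices 0, ..., n + m")
    return [(phi_values[i + n] - phi_values[i] - n * E) / n ** 0.5
            for i in range(1, m + 1)]

# Hypothetical input: a function growing exactly linearly along the ray.
linear = [3 * i for i in range(200)]
atoms = omega_atoms(linear, n=16, m=100, E=3.0)
```

For the linear input every increment equals $nE$, so all atoms sit at 0 and the empirical measure is a point mass, the degenerate case of a normal distribution with standard deviation 0.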

For each component $C^i$ let $Y^i$ be the subset of $Y$ consisting of paths which
eventually enter the component $C^i$ and never leave. Note that $Y^i$ and $Y^j$ are
disjoint if $i \ne j$.

Lemma 4.22. The measure $\hat\nu(Y - \cup_i Y^i) = 0$. Moreover, $\hat\nu$ a.e. $y \in Y^i$ are
$E^i, \sigma^i$-typical, where $E^i, \sigma^i$ are as in Theorem 4.20.

Proof. The number of paths in $\Gamma$ of length $n$ which never enter a component $C$
with $\xi(C) = \lambda$ is $o(\lambda^n)$, so from the definition of $\hat\nu$ it follows that $\hat\nu$ a.e. $y \in Y$
enters some $C^i$. If a path leaves some component with $\xi = \lambda$ it can never enter
another one with $\xi = \lambda$, so $\hat\nu$ a.e. $y \in Y$ which enter some $C^i$ never leave.

We now prove the second statement of the Lemma. The following proof was 
suggested by Shigenori Matsumoto. 

We fix the notation below for the course of the Lemma (the reader should be 
warned that it is slightly incompatible with notation used elsewhere; this is done 
to avoid a proliferation of subscripts). Let $C_i$ be a component of $\Gamma$ with
Perron-Frobenius eigenvalue $\lambda$. Let $Y_i$ be the set of infinite paths in $\Gamma$ starting at $v_1$
that eventually stay in $C_i$, and let $X_i$ be the set of infinite paths in $C_i$. There is a
measure $\bar\mu_i$ on $X_i$ obtained by restricting $\mu$ on $X$. The measure $\bar\mu_i$ is determined by
a stationary measure $\mu_i$ on $C_i$ and a transition matrix $N(i)$, by restricting $\mu$ and $N$.
The measure $\mu_i$ is stationary for $N(i)$ (i.e. $\mu_i^T N(i) = \mu_i^T$), so $\bar\mu_i$ is shift invariant.
Since $C_i$ is a component, $\mu_i^T$ is (up to scale) the only eigenvector of $N(i)$ with
eigenvalue 1, so $\mu_i$ is extremal in the space of stationary measures. Therefore by the
random ergodic theorem (see e.g. [17], Chapter 10) the measure $\bar\mu_i$ on $X_i$ is ergodic.
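
The role of the component hypothesis can be seen in a small numerical experiment.
The following Python sketch is only an illustration and is not part of the argument:
the matrices `N_COMPONENT` and `N_REDUCIBLE` are hypothetical examples, not the
matrices $N(i)$ of the text. It iterates $\mu \mapsto \mu N$ from two different initial
distributions. For the irreducible (and aperiodic) matrix both runs approach the same
stationary vector; for the reducible matrix the limit depends on the start, so the
stationary vector is not unique.

```python
def push_forward(mu, N):
    """One step of the row vector mu under the stochastic matrix N (mu -> mu N)."""
    k = len(N)
    return [sum(mu[i] * N[i][j] for i in range(k)) for j in range(k)]


def iterate(mu, N, steps=200):
    """Apply mu -> mu N repeatedly, starting from the distribution mu."""
    for _ in range(steps):
        mu = push_forward(mu, N)
    return mu


# Hypothetical irreducible, aperiodic example: a single component.
# Its stationary vector is (1/4, 1/2, 1/4).
N_COMPONENT = [[0.5, 0.5, 0.0],
               [0.25, 0.5, 0.25],
               [0.0, 0.5, 0.5]]

# Hypothetical reducible example: states {0, 1} and {2} do not communicate.
N_REDUCIBLE = [[0.5, 0.5, 0.0],
               [0.5, 0.5, 0.0],
               [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    for name, N in [("component", N_COMPONENT), ("reducible", N_REDUCIBLE)]:
        from_first = iterate([1.0, 0.0, 0.0], N)
        from_last = iterate([0.0, 0.0, 1.0], N)
        print(name, from_first, from_last)
```

Power iteration is used here only because it is short; it relies on aperiodicity,
which is an extra assumption of this example and not something the text requires.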

Now, there is a subset $X^*$ of $X_i$ of full measure such that for all $\gamma \in X^*$

$$\frac{1}{m} \sum_{k=1}^{m} \delta_{S^k \gamma} \to \bar\mu_i$$

in the weak* topology. On the other hand, on $Y_i$ there is a measure $\nu_i$ which is the
restriction of $\nu$. Define $q : Y_i \to X_i$ by

$$q(\gamma) = S^{n(\gamma)}(\gamma)$$

where $n : Y_i \to \mathbb{N}$ satisfies the following condition. Let $\pi : X_i \to C_i$ take each
infinite walk to its initial vertex. Choose $n$ so that $\pi \circ q : Y_i \to C_i$ sends the
measure $\nu_i$ on $Y_i$ to a measure on $C_i$ of full support. The measure $q_*\nu_i$ on $X_i$
is obtained from an initial measure $\mu_q$ and the transition matrix $N(i)$ as in § 4.2;
consequently $q_*\nu_i$ and $\bar\mu_i$ are quasi-equivalent (i.e. each is absolutely continuous
with respect to the other).
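
The insensitivity of the empirical averages to the initial distribution can be
observed in a small simulation. The sketch below is illustrative only: the chain `N`,
its stationary vector `MU`, the seed and the number of steps are hypothetical choices,
and only cylinder sets of length one (vertex frequencies) are examined, which is much
weaker than weak* convergence of the empirical measures.

```python
import random

# Hypothetical irreducible chain on three vertices (not taken from the text).
N = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
MU = [0.25, 0.5, 0.25]  # stationary for N: MU * N == MU


def step(state, rng):
    """Sample the next vertex from row `state` of N."""
    u = rng.random()
    acc = 0.0
    for j, p in enumerate(N[state]):
        acc += p
        if u < acc:
            return j
    return len(N[state]) - 1  # guard against rounding in the cumulative sum


def empirical_frequencies(start, steps, seed=0):
    """Fraction of time a single random walk from `start` spends at each vertex."""
    rng = random.Random(seed)
    visits = [0] * len(N)
    state = start
    for _ in range(steps):
        visits[state] += 1
        state = step(state, rng)
    return [v / steps for v in visits]


if __name__ == "__main__":
    for start in range(3):
        print(start, empirical_frequencies(start, 200000))
```

Each starting vertex plays the role of a different initial measure; in this example
the observed frequencies are close to `MU` in every case.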

It follows that $Y^* := q^{-1}(X^*)$ has full measure with respect to $\nu_i$, and if $\gamma \in Y^*$,
then

$$\frac{1}{m} \sum_{k=1}^{m} \delta_{S^k \gamma} \to \bar\mu_i$$

By Theorem 4.20 it follows that the geodesic ray in $G$ associated to any $\gamma \in Y^*$ is
$E^i, \sigma^i$-typical, and the lemma is proved. □

Up to this point, we have only used the weak combability of $\phi$. The next Lemma
requires bicombability. 

Lemma 4.23. Let $\gamma$ be an $E, \sigma$-typical geodesic ray in $G$. If $\phi$ is combable, and if
$\gamma'$ is a geodesic ray with the same endpoint in $\partial G$ as $\gamma$, then $\gamma'$ is also $E, \sigma$-typical.
If $\phi$ is bicombable then for any $g \in G$ the translate $g\gamma$ is $E, \sigma$-typical.

Proof. Let $\gamma$ and $\gamma'$ have the same endpoint. Then there is a constant $C$ such that
$d_L(\gamma_i, \gamma'_i) \le C$ and therefore $|\phi(\gamma_i) - \phi(\gamma'_i)| \le K$ for some $K$ independent of $i$. This
shows that $\gamma'$ is $E, \sigma$-typical if $\gamma$ is. Similarly, if $g \in G$ then $d_R(g\gamma_i, \gamma_i) \le C$ and
therefore $|\phi(g\gamma_i) - \phi(\gamma_i)| \le K$ for some $K$ independent of $i$. □

For each component $C^i$, let $\partial^i G$ denote the image of the $E^i, \sigma^i$-typical $y \in Y^i$
under $Y \to \partial G$. Note that $\nu(\partial^i G) > 0$ for all $i$. By Theorem [?] and Lemma [?],
for any $i, j$ there is $g \in G$ with $\nu(g\,\partial^i G \cap \partial^j G) > 0$. Hence by Lemma 4.23 there is
an infinite geodesic ray in $G$ which is both $E^i, \sigma^i$-typical and $E^j, \sigma^j$-typical. But
this implies $E^i = E^j$ and $\sigma^i = \sigma^j$. Since $i$ and $j$ were arbitrary, we have proved

Lemma 4.24. The means $E^i$ and standard deviations $\sigma^i$ as above are all equal to
some $E, \sigma$.

Recall that the measure $\nu$ on $Y$ pushes forward to a measure on $Y_n$ by projection
$Y \to Y_n$ and then to a measure $\nu$ on $G_n$ for all $n$. Note that $\nu$ on $G_n$ does not
depend on the particular choice of a digraph $\Gamma$ parameterizing $L$, but just on $L$
itself. Lemma 4.24 and Theorem 4.20 together imply our main result:

Theorem 4.25 (Central Limit Theorem for bicombable functions). Let $\phi$ be a
bicombable function on a word hyperbolic group with respect to some combing $L$.
Let $\phi_n$ be the value of $\phi$ on a random word of length $n$ with respect to the $\nu$ measure.
Then there is convergence in the sense of distribution

$$\lim_{n \to \infty} n^{-1/2}(\phi_n - nE) \to N(0, \sigma)$$

for some $\sigma \ge 0$ where $E$ denotes the mean of $d\phi$ on $\Gamma$ with respect to the stationary
measure $\mu$. In particular, $E$ and $\sigma$ are algebraic.

Proof. Fix some $g \in G_m$ which corresponds to an element $y \in Y_m$ such that the
last vertex $v_g = S^m y \in X_0 = \Gamma$ is in the support of $\mu$. For any $n > m$, the shift
map $S^m$ takes the subset $\mathrm{cone}(g) \cap G_n \subset Y_n$ with its $\nu$ measure isomorphically to the
set of walks in $X$ of length $n - m$ starting at $v_g$ with their $\mu$ measure, up to scaling
the measures by a constant factor. Fixing $g$ and letting $n$ go to infinity, we obtain
the desired Gaussian distribution for the values of $\phi$ on $\mathrm{cone}(g) \cap G_n$ as $n \to \infty$ in
the $\nu$ measure.

For any $\epsilon$, there is $m$ such that the set of $g \in G_m$ with $v_g$ not in the support of
$\mu$ has $\nu$ measure at most $\epsilon$. Since $\epsilon$ was arbitrary, we are done. □
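
One case of the theorem that can be checked by direct enumeration is a homomorphism
on a free group of rank 2 with its standard generating set, taking the measure to be
uniform on reduced words of length $n$. The following Python sketch is illustrative
only: the choice $\phi(a) = 1$, $\phi(b) = 0$ and the names `value_counts` and
`mean_and_variance` are hypothetical and do not come from the text. It computes the
exact distribution of $\phi$ on reduced words of length $n$ by dynamic programming over
the last letter of the word.

```python
from collections import defaultdict
from fractions import Fraction

LETTERS = "aAbB"  # capital letters denote inverses
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}
WEIGHT = {"a": 1, "A": -1, "b": 0, "B": 0}  # assumed homomorphism: phi(a)=1, phi(b)=0


def value_counts(n):
    """Return {value: number of reduced words of length n >= 1 with phi = value}."""
    # state[(last letter, value of phi)] = number of reduced words so far
    state = defaultdict(int)
    for x in LETTERS:
        state[(x, WEIGHT[x])] += 1
    for _ in range(n - 1):
        nxt = defaultdict(int)
        for (last, val), cnt in state.items():
            for x in LETTERS:
                if x != INVERSE[last]:  # keep the word reduced
                    nxt[(x, val + WEIGHT[x])] += cnt
        state = nxt
    totals = defaultdict(int)
    for (_, val), cnt in state.items():
        totals[val] += cnt
    return dict(totals)


def mean_and_variance(n):
    """Exact mean and variance of phi on a uniformly random reduced word of length n."""
    counts = value_counts(n)
    total = sum(counts.values())
    mean = Fraction(sum(v * c for v, c in counts.items()), total)
    second = Fraction(sum(v * v * c for v, c in counts.items()), total)
    return mean, second - mean ** 2


if __name__ == "__main__":
    for n in (2, 10, 50):
        mean, var = mean_and_variance(n)
        print(n, mean, float(var) / n)
```

In this example the mean vanishes by symmetry and the variance grows linearly in $n$,
which is the scaling by $n^{-1/2}$ in the statement of the theorem.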

The following corollary does not make reference to the measure $\nu$.

Corollary 4.26. Let $\phi$ be a bicombable function on a word-hyperbolic group $G$.
Then there is a constant $E$ such that for any $\epsilon > 0$ there is a $K$ and an $N$ so
that if $G_n$ denotes the set of elements of length $n \ge N$, there is a subset $G'_n$ with
$|G'_n|/|G_n| \ge 1 - \epsilon$, so that for all $g \in G'_n$, there is an inequality

$$|\phi(g) - nE| \le K \cdot \sqrt{n}$$

As a special case, let $S_1, S_2$ be two finite generating sets for $G$. Word length
in the $S_2$ metric is a bicombable function with respect to a combing $L_1$ for the $S_1$
generating set. Hence:

Corollary 4.27. Let $S_1$ and $S_2$ be finite generating sets for a word-hyperbolic group
$G$. There is an algebraic number $\lambda_{1,2}$ such that for any $\epsilon > 0$, there is a $K$ and an
$N$ so that if $G_n$ denotes the set of elements of length $n \ge N$ in the $S_1$ metric, there is
a subset $G'_n$ with $|G'_n|/|G_n| \ge 1 - \epsilon$, so that for all $g \in G'_n$ there is an inequality

$$\bigl| |g|_{S_2} - \lambda_{1,2} |g|_{S_1} \bigr| \le K \cdot \sqrt{n}$$

Remark 4.28. A counting argument shows $\lambda_{1,2} \ge \log\lambda_1/\log\lambda_2$ where the number
of words of length $n$ in the $S_i$ metric is $\Theta(\lambda_i^n)$. But equality should only hold in
very special circumstances, and in fact the number $\lambda_{1,2} \log\lambda_2/\log\lambda_1$ is probably
very interesting in general. Such numbers have been studied in free groups in the
special case where $S_1$ and $S_2$ are free bases, by [14].
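
The comparison of two word metrics in Corollary 4.27 can be explored by brute force in
a free group of rank 2. The following Python sketch is illustrative only: the second
generating set `S2` (the standard generators together with $ab$ and its inverse) and
the function names are hypothetical choices. It computes $|g|_{S_2}$ by breadth-first
search and averages the ratio $|g|_{S_2}/n$ over the sphere of radius $n$ in the
standard metric. At the small radii that are feasible here the average need not be
close to the limiting constant, and no value for that constant is claimed.

```python
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}  # capital = inverse letter
S1 = ["a", "A", "b", "B"]   # standard generators of the free group on a, b
S2 = S1 + ["ab", "BA"]      # hypothetical second generating set: adds ab and (ab)^{-1}


def multiply(u, v):
    """Product of two reduced words, freely reduced."""
    out = list(u)
    for x in v:
        if out and out[-1] == INVERSE[x]:
            out.pop()
        else:
            out.append(x)
    return "".join(out)


def word_lengths(generators, radius):
    """Breadth-first search: word length of every element of length <= radius."""
    dist = {"": 0}
    frontier = [""]
    for r in range(1, radius + 1):
        nxt = []
        for g in frontier:
            for s in generators:
                h = multiply(g, s)
                if h not in dist:
                    dist[h] = r
                    nxt.append(h)
        frontier = nxt
    return dist


def mean_ratio(n):
    """Average of |g|_{S2} / n over elements with |g|_{S1} = n, and the sphere size."""
    # S1 is contained in S2, so every g with |g|_{S1} = n has |g|_{S2} <= n
    # and is therefore found by a search of radius n in the S2 metric.
    dist2 = word_lengths(S2, n)
    sphere = [g for g in dist2 if len(g) == n]  # len of a reduced word is |g|_{S1}
    return sum(dist2[g] for g in sphere) / (n * len(sphere)), len(sphere)


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, mean_ratio(n))
```

Since every element of `S2` has length at most 2 in the standard metric, each ratio
lies between 1/2 and 1 in this example.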

If $\phi$ is a quasimorphism, then $|\phi(g) + \phi(g^{-1})| \le \text{const}$, so if $S$ is symmetric, then
necessarily $E$ as above is equal to 0. Hence:

Corollary 4.29. Let $\phi$ be a bicombable quasimorphism on a word-hyperbolic group
$G$. Let $\phi_n$ be the value of $\phi$ on a random word of length $n$ with respect to the $\nu$
measure. Then there is convergence in the sense of distribution

$$\lim_{n \to \infty} n^{-1/2} \phi_n \to N(0, \sigma)$$

for some $\sigma \ge 0$.

5. Hölder quasimorphisms

The distribution of quasimorphisms on free groups has been studied by Horsham- 
Sharp. We briefly describe their results. Let F denote a free group (on some finite 
generating set). 

Definition 5.1. For any $g, a \in F$ and any $\psi : F \to \mathbb{R}$ define

$$\Delta_a \psi(g) = \psi(g) - \psi(ag)$$

For $x, y \in F$ let $(x|y)$ denote the Gromov product

$$(x|y) := (|x| + |y| - |x^{-1}y|)/2$$

A quasimorphism $\psi$ on $F$ is Hölder if for any $a \in F$ there are constants $C, c > 0$
such that for any $x, y \in F$ there is an inequality

$$|\Delta_a \psi(x) - \Delta_a \psi(y)| \le C e^{-c(x|y)}$$

Note that the constants C, c depend on a but not on x or y. The main theorem 
of [13], which also appeared in Matthew Horsham's PhD thesis, is the following: 

Theorem 5.2 (Horsham-Sharp [13]). Let $\psi$ be a Hölder quasimorphism on a free
group. If $\psi_n$ denotes the value of $\psi$ on a random word in $F$ of length $n$ (with respect
to a standard generating set) then there is convergence in the sense of distributions

$$\lim_{n \to \infty} n^{-1/2} \psi_n \to N(0, \sigma)$$

for some $\sigma$.

These results can also be generalized to fundamental groups of closed (hyperbolic)
surfaces. Big (Brooks) counting quasimorphisms on free groups are Hölder,
since $\Delta_a \psi(x) = \Delta_a \psi(y)$ whenever $(x|y)$ is bigger than $|a|$. But small counting
quasimorphisms (i.e. counting quasimorphisms where copies must be disjoint) are
not, as the following example shows:

Example 5.3. Let $\phi$ be the small counting quasimorphism for the word $abab$. Then

$$\phi(\underbrace{babab \cdots ab}_{4n+1}) = n, \quad \phi(\underbrace{ababab \cdots ab}_{4n+2}) = n$$

but

$$\phi(\underbrace{babab \cdots ab}_{4n+3}) = n, \quad \phi(\underbrace{ababab \cdots ab}_{4n+4}) = n + 1$$
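
The values in this example, and the resulting failure of the Hölder condition, can be
checked mechanically. The following Python sketch is illustrative only: reduced words
are encoded as strings with capital letters for inverses, and all function names are
hypothetical. The number of disjoint copies of a word is computed with `str.count`,
which counts non-overlapping occurrences from left to right; for copies of a single
word of fixed length this leftmost greedy choice is maximal.

```python
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}  # capital = inverse letter


def multiply(u, v):
    """Product of two reduced words in the free group on a, b, freely reduced."""
    out = list(u)
    for x in v:
        if out and out[-1] == INVERSE[x]:
            out.pop()
        else:
            out.append(x)
    return "".join(out)


def inverse(u):
    """Inverse of a reduced word."""
    return "".join(INVERSE[x] for x in reversed(u))


def gromov_product(x, y):
    """(x|y) = (|x| + |y| - |x^{-1} y|) / 2, as in Definition 5.1."""
    return (len(x) + len(y) - len(multiply(inverse(x), y))) // 2


def phi(g, w="abab"):
    """Small counting quasimorphism: disjoint copies of w minus those of w^{-1}."""
    return g.count(w) - g.count(inverse(w))


def delta(a, g):
    """Delta_a phi(g) = phi(g) - phi(a g)."""
    return phi(g) - phi(multiply(a, g))


def alternating(k):
    """The alternating word in a and b of length k that ends in 'ab'."""
    return "".join("b" if (k - i) % 2 == 1 else "a" for i in range(k))


if __name__ == "__main__":
    for n in range(1, 4):
        x, y = alternating(4 * n + 1), alternating(4 * n + 3)
        print(n, gromov_product(x, y), delta("a", x), delta("a", y))
```

With $x$ and $y$ the alternating words of lengths $4n+1$ and $4n+3$, the Gromov
product $(x|y)$ equals $4n+1$ while $\Delta_a \phi(x) = 0$ and $\Delta_a \phi(y) = -1$, so the
difference does not decay as $(x|y)$ grows.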

Of course, since small counting quasimorphisms are bicombable, Theorem 4.25
applies. Note that the measure $\nu$ agrees with the uniform measure in a free group
with the standard generating set.

It seems plausible that Horsham-Sharp's methods apply to arbitrary Hölder
quasimorphisms on hyperbolic groups, though we have not pursued this. 